[x3d-public] X3D Regex

Andreas Plesch andreasplesch at gmail.com
Thu Jun 7 10:07:22 PDT 2018


On Thu, Jun 7, 2018 at 10:34 AM Andreas Plesch <andreasplesch at gmail.com>
wrote:

> Let's say we do _not_ like leading zeroes, multiple commas, white
> space after the sign, and 0X for a hexadecimal prefix.
>
> ([ \t]*[-|+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([ \t]+|,))* //
> obsolete
>
> is then proposed.
>
>
Since the pattern above requires a space or comma after the last number, we
need to allow the end anchor ($) after a number as well:

([ \t]*[-|+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([ \t]+|,|$))*
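
To give this a first workout, here is a minimal sketch that runs both
patterns against a few inputs from this thread. It assumes Python's re
syntax and whole-string matching via re.fullmatch; other engines may treat
anchors differently. The helper name is_valid and the sample list are only
for illustration.

```python
import re

# Both patterns are copied verbatim from this message.
OBSOLETE = r"([ \t]*[-|+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([ \t]+|,))*"
REVISED = r"([ \t]*[-|+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([ \t]+|,|$))*"


def is_valid(pattern, text):
    """Return True if the entire string matches the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


samples = [
    "",                        # empty array
    "0 1 2 30 -0 -1 -2 -30",   # listed as legal in this thread
    "1 2 ",                    # trailing white space
    "1, 2",                    # single comma
    "0xAF",                    # hexadecimal with lowercase prefix
    "010", "-020",             # leading zeroes
    "1,,2",                    # multiple commas
    "+ 12",                    # white space after the sign
    "0XAF",                    # uppercase hexadecimal prefix
    "|5",                      # vertical bar in place of a sign
]

for text in samples:
    print(f"{text!r:26} obsolete={is_valid(OBSOLETE, text)} revised={is_valid(REVISED, text)}")
```

One thing such a test brings out: inside a character class the | is a
literal, so [-|+] also accepts a vertical bar as a sign. Writing [-+]
would avoid that.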



> In words:
>
> Match zero (to accommodate the empty array) or more times all of the
> following:
> optional leading white space (just space or tab), followed by
> an optional plus or minus sign, followed by
> one of the following:
> - a single digit (between 0 and 9, so including 0)
> - a number of two or more digits starting with a digit between 1 and 9
> - a hexadecimal number starting with 0x and followed by at least one of
>   either
>   - a digit
>   - a letter between a and f
>   - a letter between A and F
> which is then followed by either
> - one or more white space characters
> - or a single comma
>

- or the end of the string
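
The description in words can also be kept next to the pattern itself. A
sketch, assuming Python's re.VERBOSE mode, where white space outside
character classes is ignored and # starts a comment; the names MFINT32 and
COMPACT are only for illustration:

```python
import re

# The revised pattern, spread out so each part carries its description.
MFINT32 = re.compile(r"""
    (                                 # zero or more times, all of the following:
        [ \t]*                        #   optional leading white space (space or tab)
        [-|+]?                        #   optional sign (character class copied as proposed)
        (
            [0-9]                     #   a single digit, including 0
          | [1-9][0-9]+               #   two or more digits, the first between 1 and 9
          | (0x([0-9]|[a-f]|[A-F])+)  #   0x followed by at least one hexadecimal digit
        )
        ( [ \t]+ | , | $ )            #   then white space, a single comma, or the end
    )*
""", re.VERBOSE)

# The one-line form from this message, for comparison.
COMPACT = re.compile(
    r"([ \t]*[-|+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([ \t]+|,|$))*"
)

for text in ["", "0 1 2 30", "1, 2", "0x1f", "010", "1,,2", "+ 12"]:
    verbose_ok = MFINT32.fullmatch(text) is not None
    compact_ok = COMPACT.fullmatch(text) is not None
    print(f"{text!r:12} verbose={verbose_ok} compact={compact_ok}")
```

Both forms should agree on every input, since only the layout differs.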

Sorry for the oversight. Let's give this a good workout,  -Andreas


> Testing with online tools looks good.
>
> Some more items of interest:
>
> http://www.web3d.org/pipermail/x3d-public_web3d.org/2012-March/001950.html
> is a previous discussion on floating points regex.
>
> http://www.web3d.org/specifications/X3dRegularExpressions.html claims that
>
> The SFInt32 pattern is a native XML type:
>
> <xs:restriction base="xs:integer"/>
>
> https://www.w3.org/TR/xmlschema11-2/#integer is the definition.
>
> The XML spec says it is derived from 'decimal', so hexadecimal numbers
> are not allowed. This probably needs to be fixed in
> http://www.web3d.org/specifications/X3dRegularExpressions.html. There
> does not seem to be a built-in hexadecimal XML data type, so probably
> another regex is required. The other option is to deprecate
> hexadecimal values for Int32. They are very rarely used anyway and
> only provide a more compact representation (in digits, not counting
> the 0x prefix) once you get above 9999 (0x270F).
>
> https://www.w3.org/TR/xmlschema11-2/#decimal also explicitly allows
> leading zeros (section 3.3.3.1), which suggests that the native XML
> types may not be a good fit for X3D SFInt32 and therefore MFInt32.
>
> -Andreas
>
>
>
>
>
> On Thu, Jun 7, 2018 at 8:01 AM, Andreas Plesch <andreasplesch at gmail.com>
> wrote:
> > Hi Don
> >
> > On Thu, Jun 7, 2018, 12:17 AM Don Brutzman <brutzman at nps.edu> wrote:
> >>
> >> great to see the dialog and scrutiny, thanks!
> >>
> >> intent is to allow legal/standard content but disallow (or at least
> >> diagnose) broken/problematic/nonstandard content.
> >
> >
> > I think it may be helpful to more definitely state what the function of
> > the regex is:
> >
> > - flag certainly broken content but still allow questionable content
> > - only allow well behaved content and flag questionable but parseable and
> > possibly legal content
> >
> > There could be two versions, strict and lax.
> >
> >>
> >> for MFInt32,
> >>
> >> - legal: 0 1 2 30 -0 -1 -2 -30
> >>
> >> - illegal: 010 -020 (leading zeroes also an indicator that intermediate
> >> whitespace was dropped)
> >
> >
> > The official regex /((\+|\-)?(0|[1-9][0-9]*)?( )?(,)?( )?)*/  currently
> > allows leading zeros as legal, perhaps by accident.
> >
> >>
> >> - single comma OK between numbers, but multiple commas an indicator that
> >> an intermediate number was dropped
> >
> >
> > The official regex currently allows multiple commas.
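
A quick way to check these two observations about the official pattern is
a sketch like the following. It assumes Python's re module with
whole-string matching and drops the / delimiters; the helper name accepts
is only for illustration.

```python
import re

# The official pattern as quoted in this thread, without the / delimiters.
OFFICIAL = r"((\+|\-)?(0|[1-9][0-9]*)?( )?(,)?( )?)*"


def accepts(text):
    """Return True if the entire string matches the official pattern."""
    return re.fullmatch(OFFICIAL, text) is not None


# Every part inside the repeated group is optional, so "010" can be
# consumed as "0" followed by "10", and a comma can stand on its own.
for text in ["0 1 2 30", "010", "-020", "1,,2", ",,,", "+ 12", "1\t2"]:
    print(f"{text!r:12} {accepts(text)}")
```

Since the sign, the number, the spaces and the comma are each optional,
only characters outside that set (such as a tab) cause a rejection.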
> >
> > Other patterns to clarify as being considered problematic include:
> >
> > - +0
> > - -0
> > - + 12 (white space after sign)
> > - tab,newline,return as white space
> > - capital letters for hex: 0xAF
> > - 0X in addition to 0x as prefix for hexadecimal
> >
> >>
> >> I have the O'Reilly books on this topic, they often give good/adaptable
> >> "cookbook recipes" worth considering.  can look further next week.
> >>
> >> it is helpful to be very wary of any conclusions whatsoever without
> >> testing a regex.
> >
> >
> > Oh yes.
> >
> >>
> >>
> >> if alternatives are found, great - we can test with online tools, with
> >> Netbeans/X3D-Edit, and with regression testing of the X3D Examples
> >> Archive.
> >
> >
> > It sounds like the current regex does not quite match expectations.
> > Perhaps the archive does not have leading zeros anywhere, or any of the
> > other problematic patterns such as multiple commas?
> >
> > -Andreas
> >
> >>
> >> all the best, Don
> >> --
> >> Don Brutzman  Naval Postgraduate School, Code USW/Br
> >> brutzman at nps.edu
> >> Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA
> >> +1.831.656.2149
> >> X3D graphics, virtual worlds, navy robotics
> >> http://faculty.nps.edu/brutzman
> >>
> >
>
>
>
> --
> Andreas Plesch
> Waltham, MA 02453
>


-- 
Andreas Plesch
Waltham, MA 02453