[x3d-public] X3D Regex

Andreas Plesch andreasplesch at gmail.com
Thu Jun 7 07:34:30 PDT 2018


Let's say we do _not_ like leading zeroes, multiple commas, white
space after the sign, or 0X as a hexadecimal prefix.

([ \t]*[-+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([ \t]+|,))*

is then proposed.

In words:

Match zero (to accommodate the empty array) or more times all of the following:
optional leading white space (just space or tab), followed by
an optional plus or minus sign, followed by
one of the following:
 - a single digit (between 0 and 9, so including 0)
 - a number of two or more digits starting with a digit between 1 and 9
 - a hexadecimal number starting with 0x and followed by at least one of either
   - a digit
   - a letter between a and f
   - a letter between A and F
which is then followed by either
 - one or more white space characters (space or tab)
 - or a single comma

Testing with online tools looks good.
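
As a rough check of the description above, the pattern can be exercised
with Python's re module. This is only a sketch: Python is an arbitrary
choice, the sample strings are made up for illustration, and the sign
class is written as [-+] on the assumption that a | inside a character
class (which would be read as a literal bar) is not wanted. As written,
every number has to be followed by white space or a comma, so most
samples carry a trailing space. The official pattern quoted further
down in this message is included for contrast.

```python
import re

# Proposed MFInt32 pattern from this message; sign class written as [-+]
# (assumption: a literal "|" is not meant to be accepted as a sign).
PROPOSED = re.compile(
    r"([ \t]*[-+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([ \t]+|,))*"
)

# Official pattern as quoted later in this message, for contrast.
OFFICIAL = re.compile(r"((\+|\-)?(0|[1-9][0-9]*)?( )?(,)?( )?)*")


def accepts(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


# Hypothetical sample values, chosen only to illustrate the rules above.
samples = ["", "0 1 2 30 ", "-1,+2,0xAF ", "010 ", "1,,2 ",
           "+ 12 ", "0X1F ", "1 2 3"]
for s in samples:
    print(repr(s), accepts(PROPOSED, s), accepts(OFFICIAL, s))
```

On these samples the proposed pattern rejects the leading zero, the
doubled comma, the space after the sign and the 0X prefix, while the
official pattern still accepts leading zeros and doubled commas. The
last sample has no trailing separator and is rejected by the proposed
pattern, so the final separator may need to be made optional.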

Some more items of interest:

http://www.web3d.org/pipermail/x3d-public_web3d.org/2012-March/001950.html
is a previous discussion on floating points regex.

http://www.web3d.org/specifications/X3dRegularExpressions.html claims that

The SFInt32 pattern is a native XML type:

<xs:restriction base="xs:integer"/>

https://www.w3.org/TR/xmlschema11-2/#integer is the definition.

The XML Schema spec says it is derived from 'decimal', so no hexadecimal
numbers. This probably needs to be fixed in
http://www.web3d.org/specifications/X3dRegularExpressions.html. There
does not seem to be a hexadecimal built-in XML data type, so probably
another regex is required. The other option is to deprecate
hexadecimal values for Int32. They are very rarely used anyway and
only give a shorter digit string once you get above 9999 (0x270F),
and that is before counting the 0x prefix.

https://www.w3.org/TR/xmlschema11-2/#decimal also explicitly allows
leading zeros (section 3.3.3.1), which indicates that the native XML
types may not be a good fit for X3D SFInt32, and therefore MFInt32.
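
To illustrate the mismatch, here is a small Python sketch. It assumes
the xs:integer lexical space can be approximated by an optional sign
followed by one or more decimal digits, which is one reading of the
XML Schema definition linked above; the helper name and the sample
values are made up for illustration.

```python
import re

# Assumed approximation of the xs:integer lexical space:
# an optional sign followed by one or more decimal digits.
XS_INTEGER = re.compile(r"[-+]?[0-9]+")


def is_xs_integer(text):
    """Hypothetical helper: True if text looks like a lexical xs:integer."""
    return XS_INTEGER.fullmatch(text) is not None


# Leading zeros pass, hexadecimal does not (under the assumption above).
for value in ["30", "-0", "010", "0x1F"]:
    print(value, is_xs_integer(value))
```

Under that assumption "010" is accepted and "0x1F" is rejected, which
is the opposite of what the X3D rules discussed here would want.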

-Andreas





On Thu, Jun 7, 2018 at 8:01 AM, Andreas Plesch <andreasplesch at gmail.com> wrote:
> Hi Don
>
> On Thu, Jun 7, 2018, 12:17 AM Don Brutzman <brutzman at nps.edu> wrote:
>>
>> great to see the dialog and scrutiny, thanks!
>>
>> intent is to allow legal/standard content but disallow (or at least
>> diagnose) broken/problematic/nonstandard content.
>
>
> I think it may be helpful to more definitely state what the function of the
> regex is:
>
> - flag certainly broken content but still allow questionable content
> - only allow well behaved content and flag questionable but parseable and
> possibly legal content
>
> There could be two versions, strict and lax.
>
>>
>> for MFInt32,
>>
>> - legal: 0 1 2 30 -0 -1 -2 -30
>>
>> - illegal: 010 -020 (leading zeroes also an indicator that intermediate
>> whitespace was dropped)
>
>
> The official regex /((\+|\-)?(0|[1-9][0-9]*)?( )?(,)?( )?)*/  currently
> allows leading zeros as legal, perhaps by accident.
>
>>
>> - single comma OK between numbers, but multiple commas an indicator that
>> an intermediate number was dropped
>
>
> The official regex currently allows multiple commas.
>
> Other patterns to clarify as being considered problematic include:
>
> - +0
> - -0
> - + 12 (white space after sign)
> - tab,newline,return as white space
> - capital letters for hex: 0xAF
> - 0X in addition to 0x as prefix for hexadecimal
>
>>
>> I have the O'Reilly books on this topic, they often give good/adaptable
>> "cookbook recipes" worth considering.  can look further next week.
>>
>> it is helpful to be very wary of any conclusions whatsoever without
>> testing a regex.
>
>
> Oh yes.
>
>>
>>
>> if alternatives are found, great - we can test with online tools, with
>> Netbeans/X3D-Edit, and with regression testing of the X3D Examples Archive.
>
>
> It sounds like the current regex does not quite match expectations.
> The archive may not have leading zeros anywhere ? Or some of the other
> problematic patterns like multiple commas ?
>
> -Andreas
>
>>
>> all the best, Don
>> --
>> Don Brutzman  Naval Postgraduate School, Code USW/Br
>> brutzman at nps.edu
>> Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA
>> +1.831.656.2149
>> X3D graphics, virtual worlds, navy robotics
>> http://faculty.nps.edu/brutzman
>>
>



--
Andreas Plesch
Waltham, MA 02453


