[x3d-public] X3D Regex
Andreas Plesch
andreasplesch at gmail.com
Thu Jun 7 07:34:30 PDT 2018
Let's say we do _not_ like leading zeroes, multiple commas, white
space after the sign, and 0X as a hexadecimal prefix.
([ \t]*[-+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([ \t]+|,))*
is then proposed.
In words:
Match zero (to accommodate the empty array) or more times all of the following:
optional leading white space (just space or tab), followed by
an optional + or minus sign, followed by
one of the following:
- a single digit (between 0 and 9, so including 0)
- a two or more digit number starting with a digit between 1 and 9
- a hexadecimal number starting with 0x and followed by at least one of either
    - a digit
    - a letter between a and f
    - a letter between A and F
which is then followed by either
- one or more white space characters (space or tab)
- or a single comma
Testing with online tools looks good.
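The same check can be repeated offline. Below is a minimal sketch in Python, assuming the standard `re` module and whole-string matching with `fullmatch`. The sign class is written as `[-+]`, which is an assumption about the intent: inside a character class `|` is a literal character, not an alternation. The helper name `matches` and the sample strings are made up for illustration.

```python
import re

# Proposed strict MFInt32 pattern from this message.
# Assumption: the sign class is meant to be [-+] (no literal '|').
PROPOSED = re.compile(
    r"([ \t]*[-+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([ \t]+|,))*"
)


def matches(text: str) -> bool:
    """Return True if the whole string is accepted by the proposed pattern."""
    return PROPOSED.fullmatch(text) is not None


# Hypothetical sample values, each with a trailing separator where needed.
samples = [
    "",               # empty array
    "0 1 2 30 ",      # plain integers
    "-0,-1,-2,-30,",  # signed, comma separated
    "0xAF ",          # hexadecimal with lowercase prefix
    "010 ",           # leading zero
    "1,,2 ",          # multiple commas
    "+ 12 ",          # white space after the sign
    "0XAF ",          # uppercase hexadecimal prefix
    "1 2 3",          # no separator after the last number
]

for s in samples:
    print(repr(s), matches(s))
```

One thing this sketch seems to show: as written, the pattern requires a separator after every number, including the last one, so a list such as "1 2 3" without trailing white space or comma is not accepted as a whole.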
Some more items of interest:
http://www.web3d.org/pipermail/x3d-public_web3d.org/2012-March/001950.html
is a previous discussion on floating points regex.
http://www.web3d.org/specifications/X3dRegularExpressions.html claims that
The SFInt32 pattern is a native XML type:
<xs:restriction base="xs:integer"/>
https://www.w3.org/TR/xmlschema11-2/#integer is the definition.
The XML spec. says it is derived from 'decimal' so no hexadecimal
numbers. This probably needs to be fixed in
http://www.web3d.org/specifications/X3dRegularExpressions.html. There
does not seem to be a hexadecimal built-in XML data type, so probably
another regex is required. The other option is to deprecate
hexadecimal values for Int32. They are very rarely used anyway and
only provide a more compact representation once you get above 9999
(0x270F).
https://www.w3.org/TR/xmlschema11-2/#decimal also explicitly allows
leading zeros (section 3.3.3.1), which suggests that the native XML types
may not be a good fit for X3D SFInt32, and therefore for MFInt32.
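To make the mismatch concrete, here is a small Python sketch. It assumes that the lexical form of xs:integer is an optional sign followed by one or more decimal digits; that is my reading of the XML Schema datatypes spec, not a quote from it. The names `XS_INTEGER` and `is_xs_integer` are hypothetical.

```python
import re

# Assumed lexical form of xs:integer: optional sign, then decimal digits only.
XS_INTEGER = re.compile(r"[-+]?[0-9]+")


def is_xs_integer(token: str) -> bool:
    """Return True if the token fits the assumed xs:integer lexical form."""
    return XS_INTEGER.fullmatch(token) is not None


for token in ["30", "-30", "010", "0x270F"]:
    print(token, is_xs_integer(token))

# The decimal and hexadecimal spellings mentioned above denote the same value.
print(int("270F", 16) == 9999)
```

Under that assumption, a leading zero such as 010 passes while any 0x value fails, which is the opposite of what is wanted for SFInt32.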
-Andreas
On Thu, Jun 7, 2018 at 8:01 AM, Andreas Plesch <andreasplesch at gmail.com> wrote:
> Hi Don
>
> On Thu, Jun 7, 2018, 12:17 AM Don Brutzman <brutzman at nps.edu> wrote:
>>
>> great to see the dialog and scrutiny, thanks!
>>
>> intent is to allow legal/standard content but disallow (or at least
>> diagnose) broken/problematic/nonstandard content.
>
>
> I think it may be helpful to more definitely state what the function of the
> regex is:
>
> - flag certainly broken content but still allow questionable content
> - only allow well behaved content and flag questionable but parseable and
> possibly legal content
>
> There could be two versions, strict and lax.
>
>>
>> for MFInt32,
>>
>> - legal: 0 1 2 30 -0 -1 -2 -30
>>
>> - illegal: 010 -020 (leading zeroes also an indicator that intermediate
>> whitespace was dropped)
>
>
> The official regex /((\+|\-)?(0|[1-9][0-9]*)?( )?(,)?( )?)*/ currently
> allows leading zeros as legal, perhaps by accident.
>
>>
>> - single comma OK between numbers, but multiple commas an indicator that
>> an intermediate number was dropped
>
>
> The official regex currently allows multiple commas.
>
> Other patterns to clarify as being considered problematic include:
>
> - +0
> - -0
> - + 12 (white space after sign)
> - tab, newline, return as white space
> - capital letters for hex: 0xAF
> - 0X in addition to 0x as prefix for hexadecimal
>
>>
>> I have the O'Reilly books on this topic, they often give good/adaptable
>> "cookbook recipes" worth considering. can look further next week.
>>
>> it is helpful to be very wary of any conclusions whatsoever without
>> testing a regex.
>
>
> Oh yes.
>
>>
>>
>> if alternatives are found, great - we can test with online tools, with
>> Netbeans/X3D-Edit, and with regression testing of the X3D Examples Archive.
>
>
> It sounds like the current regex does not quite match expectations.
> The archive may not have leading zeros anywhere? Or some of the other
> problematic patterns like multiple commas?
>
> -Andreas
>
>>
>> all the best, Don
>> --
>> Don Brutzman Naval Postgraduate School, Code USW/Br
>> brutzman at nps.edu
>> Watkins 270, MOVES Institute, Monterey CA 93943-5000 USA
>> +1.831.656.2149
>> X3D graphics, virtual worlds, navy robotics
>> http://faculty.nps.edu/brutzman
>>
>
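The observations about the currently published pattern in the quoted message can be reproduced with a short Python sketch. It assumes the pattern is applied to the whole attribute value (hence `fullmatch`) and copies the expression from the quoted text without the surrounding slashes. The helper name `accepted` is made up for illustration.

```python
import re

# Currently published MFInt32 pattern, as quoted above.
OFFICIAL = re.compile(r"((\+|\-)?(0|[1-9][0-9]*)?( )?(,)?( )?)*")


def accepted(text: str) -> bool:
    """Return True if the whole string is accepted by the quoted pattern."""
    return OFFICIAL.fullmatch(text) is not None


for value in ["0 1 2 30", "010", "-020", "1,,2", "+ 12", "0x1F", "1\t2"]:
    print(repr(value), accepted(value))
```

Because every part inside the repeated group is optional, a value like 010 can be consumed as 0 followed by 10, and a lone comma or a lone sign can form an iteration on its own. That would explain why leading zeros, multiple commas and white space after the sign all pass.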
--
Andreas Plesch
Waltham, MA 02453