<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">It was hard to follow the entire thread
and comment on each message separately, so I am merging all of my
comments into one message. I hope I've covered everything.<br>
<br>
A validator _should_ reflect the spec exactly and not be lenient
toward hand coders. This doesn't mean that X3D browsers would
fail on some of the common non-spec expressions, but the validator
should not accept non-spec values as valid.<br>
<br>
In general, it is important to note that an empty string
(field="") is not the same as the field not being specified. It
may be the case for some or all fields that being empty and not
defined both produce the default value.<br>
<br>
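As a minimal sketch of that distinction (assuming Python's standard
xml.etree.ElementTree; the Material element, the transparency attribute
and the helper name describe are only illustrative choices), an XML
parser reports an absent attribute differently from an empty one, so a
validator can treat the two cases separately:<br>
<pre>
```python
import xml.etree.ElementTree as ET


def describe(element, name):
    """Report whether an attribute is absent, empty, or carries a value."""
    value = element.get(name)  # get() returns None when the attribute is missing
    if value is None:
        return "absent"
    if value == "":
        return "empty"
    return "value"


# Three elements built in code: no attribute, field="", and a real value.
unspecified = ET.Element("Material")
empty = ET.Element("Material", {"transparency": ""})
given = ET.Element("Material", {"transparency": "0.5"})

for element in (unspecified, empty, given):
    print(describe(element, "transparency"))
```
</pre>
Whether the empty case should then fall back to the default value is a
per-field decision that this sketch does not make.<br>
<br>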
For XML encoding
(<a class="moz-txt-link-freetext" href="http://www.web3d.org/documents/specifications/19776-1/V3.3/Part01/EncodingOfFields.html">http://www.web3d.org/documents/specifications/19776-1/V3.3/Part01/EncodingOfFields.html</a>),
section 5.1.2 discusses trailing white space only for MF fields.
Should it be inferred that leading whitespace for MF and leading
or trailing white space for SF is not allowed?<br>
<br>
The standard decimal regex for integers is /^[+-]?\d+$/<br>
<br>
This supports an optional leading +/- then one or more digits
[0-9].<br>
<br>
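A quick way to try this pattern is the Python sketch below. It is an
illustration rather than a tested validator: it uses fullmatch in place
of the ^ and $ anchors (Python's $ also matches just before a trailing
newline) and the re.ASCII flag so that \d means [0-9] only. Both choices
are my assumptions about the intent, and the function name is
hypothetical.<br>
<pre>
```python
import re

# Optional sign followed by one or more ASCII digits.
SF_DECIMAL = re.compile(r"[+-]?\d+", re.ASCII)


def is_decimal_int(text):
    """True when the whole string is an optionally signed run of digits."""
    return SF_DECIMAL.fullmatch(text) is not None


for sample in ["42", "+7", "-0", "007", " 42", "42 ", "", "+", "0x1F"]:
    print(repr(sample), is_decimal_int(sample))
```
</pre>
Note that the pattern accepts leading zeros such as "007" and rejects
any surrounding white space.<br>
<br>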
Hex numbers would be something like: /^0x[0-9a-fA-F]{1,8}$/<br>
<br>
Does the regex need to indicate the maximum number of digits? The
above hex one does, but the decimal one does not. These also
assume that white space is not allowed.<br>
<br>
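One possible answer, sketched in Python: a digit limit works for hex
because {1,8} caps the value at 32 bits, but for decimal a digit count
alone cannot separate 2147483647 from 2147483648, so a numeric range
check after the match may be simpler. The signed 32-bit range and the
function names here are assumptions for illustration, and fullmatch
stands in for the ^ and $ anchors.<br>
<pre>
```python
import re

HEX = re.compile(r"0x[0-9a-fA-F]{1,8}")   # hex pattern from above
DECIMAL = re.compile(r"[+-]?[0-9]+")      # decimal pattern from above

# Assumed signed 32-bit range: -2**31 up to and including 2**31 - 1.
INT32_RANGE = range(-2**31, 2**31)


def hex_shape_ok(text):
    """Shape check only: 0x followed by one to eight hex digits."""
    return HEX.fullmatch(text) is not None


def decimal_in_int32(text):
    """Shape check first, then a numeric range check on the parsed value."""
    if DECIMAL.fullmatch(text) is None:
        return False
    return int(text) in INT32_RANGE


print(hex_shape_ok("0xFFFFFFFF"), hex_shape_ok("0x100000000"))
print(decimal_in_int32("2147483647"), decimal_in_int32("2147483648"))
```
</pre>
How a hex value with the top bit set should be interpreted is left open
here.<br>
<br>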
As already noted, when dealing with MFInt32, the regex needs to
take the separator character(s) into account. The spec says that
the values are separated by white space (tabs are legal white
space) and that commas may be used as white space.<br>
<br>
This gives something like: /^(SFINT[,\s]+)*$/<br>
<br>
Here SFINT stands for the regex for an individual SFInt32 value. This
regex does not allow leading white space. Each value must be followed
by one or more separator characters (regular white space or comma),
and that basic unit is repeated zero or more times. It does not check
the number of elements (for example, to ensure that an n-tuple field
has the correct number of values). Note that trailing white space is
allowed; as written, at least one separator is in fact required after
the last value. A value such as '0,,,0 ' is legal and creates a
2-element array, since commas are white space. This is not quite
spec compliant, in that it allows commas after the last value, whereas
the spec states "commas may only be used as whitespace between
values of individual base datatypes".<br>
<br>
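The behaviour described above can be checked with a short Python
sketch. It is not a definitive implementation: SFINT is replaced by a
decimal-only stand-in, fullmatch replaces the ^ and $ anchors, re.ASCII
limits \s to ASCII white space, and the helper names are hypothetical.
The element count is done outside the regex.<br>
<pre>
```python
import re

SFINT = r"[+-]?[0-9]+"  # decimal-only stand-in for the SFINT pattern
# Each value must be followed by at least one comma or white-space character.
MFINT = re.compile(r"(?:" + SFINT + r"[,\s]+)*", re.ASCII)
SEPARATORS = re.compile(r"[,\s]+", re.ASCII)


def matches_mf(text):
    """True when the whole string fits the MF pattern."""
    return MFINT.fullmatch(text) is not None


def values(text):
    """Split on runs of separators and drop empty pieces."""
    return [int(token) for token in SEPARATORS.split(text) if token]


def is_tuple_multiple(text, n):
    """Pattern match plus a count check for n-tuple fields."""
    return matches_mf(text) and len(values(text)) % n == 0


print(matches_mf("0,,,0 "), values("0,,,0 "))
print(matches_mf("0 0"))   # no separator after the last value
print(matches_mf(" 0 "))   # leading white space
```
</pre>
With this pattern '0 0' is rejected because nothing follows the last
value, which may or may not be the intended behaviour.<br>
<br>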
<br>
Leonard Daly<br>
<br>
P.S. I did not test all of the regexes, so I may be slightly off.
The correct representation should match the intent and not be far
from what I have written.<br>
<br>
<br>
<br>
</div>
<blockquote type="cite"
cite="mid:CAKdk67sF3uF8Ha1+u2y_vpWGD_=8kuHfnmo9JsxgcM0PSrqVmA@mail.gmail.com">
<div dir="ltr">
<div><br>
</div>
<div class="gmail_quote">
<div dir="ltr">On Thu, Jun 7, 2018 at 10:34 AM Andreas Plesch
&lt;<a href="mailto:andreasplesch@gmail.com"
moz-do-not-send="true">andreasplesch@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Let's say
we do _not_ like leading zeroes, multiple commas, white<br>
space after the sign, and 0X for a hexadecimal prefix.<br>
<br>
([ \t]*[-+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([
\t]+|,))* // obsolete<br>
<br>
is then proposed.<br>
<br>
</blockquote>
<div><br>
</div>
<div>Since the above required a space or comma at the very end, we
need to allow the end anchor ($) as an alternative after a
number:</div>
<div><br>
</div>
<div><span
style="background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">([
\t]*[-+]?([0-9]|[1-9][0-9]+|(0x([0-9]|[a-f]|[A-F])+))([
\t]+|,|$))*</span><br>
</div>
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
In words:<br>
<br>
Match zero (to accommodate the empty array) or more times
all of the following:<br>
optional leading white space (just space or tab), followed
by<br>
an optional + or minus sign, followed by<br>
one of the following:<br>
- a single digit (between 0 and 9, so including 0)<br>
- a two or more digit number starting with a digit between 1
and 9<br>
- a hexadecimal number starting with 0x and followed by at
least one of either<br>
- a digit<br>
- a letter between a and f<br>
- a letter between A and F<br>
which is then followed by either<br>
- one or more white space<br>
- or a single comma<br>
</blockquote>
<div><br>
</div>
<div>- or the end of the string</div>
<div><br>
</div>
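<div>To give the pattern a workout, here is a minimal Python sketch. It
is an assumption-laden test harness, not something from the thread:
fullmatch forces the whole string to match, the hex alternatives are
collapsed into one equivalent character class, the sign class is
written as [-+], and the function name is hypothetical.</div>
<pre>
```python
import re

# Sign class written as [-+]: inside a character class "|" is a literal
# character, so [-|+] would also accept a pipe as a "sign".
NUMBER = r"[-+]?([0-9]|[1-9][0-9]+|0x[0-9a-fA-F]+)"
# Optional leading space/tab, a number, then white space, one comma, or the end.
MFINT32 = re.compile(r"([ \t]*" + NUMBER + r"([ \t]+|,|$))*")


def is_strict_mfint32(text):
    """True when the whole string fits the strict MFInt32 pattern."""
    return MFINT32.fullmatch(text) is not None


accepted = ["", "0 1 2 30 -0 -1 -2 -30", "1,2", "1, 2", "0x1F 7"]
rejected = ["010", "1,,2", "+ 12", "0X1F", "|5"]

for sample in accepted + rejected:
    print(repr(sample), is_strict_mfint32(sample))
```
</pre>
<div>The rejected samples cover leading zeroes, multiple commas, white
space after the sign and the 0X prefix.</div>
<div><br>
</div>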
<div>Sorry for the oversight. Let's give this a good workout,
-Andreas</div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
Testing with online tools looks good.<br>
<br>
Some more items of interest:<br>
<br>
<a
href="http://www.web3d.org/pipermail/x3d-public_web3d.org/2012-March/001950.html"
rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.web3d.org/pipermail/x3d-public_web3d.org/2012-March/001950.html</a><br>
is a previous discussion on floating points regex.<br>
<br>
<a
href="http://www.web3d.org/specifications/X3dRegularExpressions.html"
rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.web3d.org/specifications/X3dRegularExpressions.html</a>
claims that<br>
<br>
The SFInt32 pattern is a native XML type:<br>
<br>
&lt;xs:restriction base="xs:integer"/&gt;<br>
<br>
<a href="https://www.w3.org/TR/xmlschema11-2/#integer"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://www.w3.org/TR/xmlschema11-2/#integer</a>
is the definition.<br>
<br>
The XML spec. says it is derived from 'decimal' so no
hexadecimal<br>
numbers. This probably needs to be fixed in<br>
<a
href="http://www.web3d.org/specifications/X3dRegularExpressions.html"
rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.web3d.org/specifications/X3dRegularExpressions.html</a>.
There<br>
does not seem to be a hexadecimal built-in XML data type, so
probably<br>
another regex is required. The other option is to deprecate<br>
hexadecimal values for Int32. They are very rarely used
anyways and<br>
only provide a more compact representation once you get
above 9999<br>
(0x270F).<br>
<br>
<a href="https://www.w3.org/TR/xmlschema11-2/#decimal"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://www.w3.org/TR/xmlschema11-2/#decimal</a>
actually also allows in<br>
3.3.3.1 explicitly leading zeros indicating that the native
XML types<br>
may not be a good fit for X3D SFInt32 and therefore MFInt32.<br>
<br>
-Andreas<br>
<br>
<br>
<br>
<br>
<br>
On Thu, Jun 7, 2018 at 8:01 AM, Andreas Plesch &lt;<a
href="mailto:andreasplesch@gmail.com" target="_blank"
moz-do-not-send="true">andreasplesch@gmail.com</a>&gt;
wrote:<br>
> Hi Don<br>
><br>
> On Thu, Jun 7, 2018, 12:17 AM Don Brutzman &lt;<a
href="mailto:brutzman@nps.edu" target="_blank"
moz-do-not-send="true">brutzman@nps.edu</a>&gt; wrote:<br>
>><br>
>> great to see the dialog and scrutiny, thanks!<br>
>><br>
>> intent is to allow legal/standard content but
disallow (or at least<br>
>> diagnose) broken/problematic/nonstandard content.<br>
><br>
><br>
> I think it may be helpful to more definitely state what
the function of the<br>
> regex is:<br>
><br>
> - flag certainly broken content but still allow
questionable content<br>
> - only allow well behaved content and flag questionable
but parseable and<br>
> possibly legal content<br>
><br>
> There could be two versions, strict and lax.<br>
><br>
>><br>
>> for MFInt32,<br>
>><br>
>> - legal: 0 1 2 30 -0 -1 -2 -30<br>
>><br>
>> - illegal: 010 -020 (leading zeroes also an
indicator that intermediate<br>
>> whitespace was dropped)<br>
><br>
><br>
> The official regex /((\+|\-)?(0|[1-9][0-9]*)?( )?(,)?(
)?)*/ currently<br>
> allows leading zeros as legal, perhaps by accident.<br>
><br>
>><br>
>> - single comma OK between numbers, but multiple
commas an indicator that<br>
>> an intermediate number was dropped<br>
><br>
><br>
> The official regex currently allows multiple commas.<br>
><br>
> Other patterns to clarify as being considered
problematic include:<br>
><br>
> - +0<br>
> - -0<br>
> - + 12 (white space after sign)<br>
> - tab,newline,return as white space<br>
> - capital letters for hex: 0xAF<br>
> - 0X in addition to 0x as prefix for hexadecimal<br>
><br>
>><br>
>> I have the O'Reilly books on this topic, they often
give good/adaptable<br>
>> "cookbook recipes" worth considering. can look
further next week.<br>
>><br>
>> it is helpful to be very wary of any conclusions
whatsoever without<br>
>> testing a regex.<br>
><br>
><br>
> Oh yes.<br>
><br>
>><br>
>><br>
>> if alternatives are found, great - we can test with
online tools, with<br>
>> Netbeans/X3D-Edit, and with regression testing of
the X3D Examples Archive.<br>
><br>
><br>
> It sounds like the current regex does not quite match
expectations.<br>
> The archive may not have leading zeros anywhere ? Or
some of the other<br>
> problematic patterns like multiple commas ?<br>
><br>
> -Andreas<br>
><br>
>><br>
>> all the best, Don<br>
>> --<br>
>> Don Brutzman Naval Postgraduate School, Code
USW/Br<br>
>> <a href="mailto:brutzman@nps.edu" target="_blank"
moz-do-not-send="true">brutzman@nps.edu</a><br>
>> Watkins 270, MOVES Institute, Monterey CA
93943-5000 USA<br>
>> +1.831.656.2149<br>
>> X3D graphics, virtual worlds, navy robotics<br>
>> <a href="http://faculty.nps.edu/brutzman"
rel="noreferrer" target="_blank" moz-do-not-send="true">http://faculty.nps.edu/brutzman</a><br>
>><br>
><br>
<br>
<br>
<br>
--<br>
Andreas Plesch<br>
Waltham, MA 02453<br>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature"
data-smartmail="gmail_signature">
<div dir="ltr">
<div>Andreas Plesch<br>
Waltham, MA 02453</div>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<div class="moz-signature">-- <br>
<font class="tahoma,arial,helvetica san serif" color="#333366">
<font size="+1"><b>Leonard Daly</b></font><br>
3D Systems &amp; Cloud Consultant<br>
LA ACM SIGGRAPH Past Chair<br>
President, Daly Realism - <i>Creating the Future</i>
</font></div>
</body>
</html>