[x3d-public] X3D regular expression (regex) improvements
Don Brutzman
brutzman at nps.edu
Sun Aug 19 22:26:26 PDT 2018
On 8/17/2018 8:06 PM, Andreas Plesch wrote:
> What about expanding the validation to multiple regexes which all need to match to check for illegal values where necessary ?
>
> first: !((0|0.0*|.0+)\s+){3})*
>
> second: actual SFRotation pattern
The relevant page section is now bookmarked, and that possibility has been added as a TODO item.
X3D Regexes: Negative lookahead and disallowed values
http://www.web3d.org/specifications/X3dRegularExpressions.html#NegativeLookahead
Modifying the primary regexes is pretty simple: just insert the given "negative lookahead" block at the beginning of the pattern.
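A minimal sketch of that insertion in Java follows. It is only an illustration: the class name, the FLOAT and ROTATION patterns are simplified stand-ins rather than the published X3D regexes, and the lookahead is adapted from the one quoted in this thread (dots escaped, optional leading whitespace added). It runs with java.util.regex only, since XML Schema does not accept lookahead.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: names and patterns here are assumptions, not the X3D or X3DJSAIL definitions.
final class NegativeLookaheadSketch {
    // Simplified floating-point token (assumption, not the published SFFloat regex).
    static final String FLOAT =
        "[+-]?((0|[1-9][0-9]*)(\\.[0-9]*)?|\\.[0-9]+)([Ee][+-]?[0-9]+)?";

    // Simplified SFRotation: four whitespace-separated floats (assumption).
    static final String ROTATION = "\\s*(" + FLOAT + "\\s+){3}" + FLOAT + "\\s*";

    // Negative lookahead adapted from the thread: rejects an unsigned 0 0 0 axis triplet.
    static final String NO_ZERO_AXIS = "(?!\\s*((0|0\\.0*|\\.0+)\\s+){3})";

    // The lookahead block is simply prepended to the primary regex.
    static final Pattern CHECKED = Pattern.compile(NO_ZERO_AXIS + ROTATION);

    static boolean isLegalRotation(String value) {
        // matches() requires the whole input to match, so no ^ or $ anchors are needed.
        return CHECKED.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLegalRotation("0 0 0 1.57")); // zero axis is rejected
        System.out.println(isLegalRotation("0 1 0 1.57")); // nonzero axis is accepted
    }
}
```

Note that this lookahead only covers unsigned zeros, so a value such as "-0 0 0 1" still passes; handling signs and exponents would need a longer block.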
> Can there be multiple regexes in the XML Schema for a type ?
Multiple special "simple types" are listed in X3D XML Schema and each can have a regex if we want.
X3D XML Schema x3d-4.0.xsd documentation
http://www.web3d.org/specifications/X3dSchemaDocumentation4.0/x3d-4.0.html
As it so happens, this weekend I cleaned up the expression for bboxSize type. Also added it to X3DUOM and X3DJSAIL. The regex can be found as part of the base type SFVec3fObject.
X3D Regular Expressions (regexes): bboxSize
http://www.web3d.org/specifications/X3dRegularExpressions.html#bboxSize
Online unit tests:
https://regex101.com/r/sjaPZq/1
Modified X3DUOM to include regexes in simple types, excerpted result:
<SimpleType name="bboxSizeType"
baseType="SFVec3f"
defaultValue="-1 -1 -1"
regex="\s*((([+]?(((0|[1-9][0-9]*)(\.[0-9]*)?|\.[0-9]+)([Ee][+-]?[0-9]+)?)\s+){2}([+]?((0|[1-9][0-9]*)(\.[0-9]*)?|\.[0-9]+)([Ee][+-]?[0-9]+)?)\s*)|(\-1(\.(0)*)?\s+\-1(\.(0)*)?\s+\-1(\.(0)*)?)\s*)?"
appinfo="bboxSizeType dimensions are non-negative values, default value (-1 -1 -1) indicates that no bounding box size has been computed."
documentation="http://www.web3d.org/documents/specifications/19775-1/V3.3/Part01/components/group.html#Boundingboxes"/>
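For anyone wanting to try the excerpted regex outside of X3DJSAIL, here is a rough standalone sketch using only java.util.regex. The class and method names are made up for illustration; the pattern string is the regex attribute above with backslashes doubled for a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical standalone check; this is not the X3DJSAIL implementation.
final class BboxSizeRegexSketch {
    // Regex copied from the bboxSizeType excerpt, escaped for Java.
    static final String REGEX_BBOXSIZETYPE =
        "\\s*((([+]?(((0|[1-9][0-9]*)(\\.[0-9]*)?|\\.[0-9]+)([Ee][+-]?[0-9]+)?)\\s+){2}"
      + "([+]?((0|[1-9][0-9]*)(\\.[0-9]*)?|\\.[0-9]+)([Ee][+-]?[0-9]+)?)\\s*)"
      + "|(\\-1(\\.(0)*)?\\s+\\-1(\\.(0)*)?\\s+\\-1(\\.(0)*)?)\\s*)?";

    static final Pattern PATTERN = Pattern.compile(REGEX_BBOXSIZETYPE);

    static boolean matchesBboxSize(String value) {
        // matches() tests the entire string, mirroring the implicit anchoring of XML Schema patterns.
        return PATTERN.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "", "-1 -1 -1", " 2.0 2.0 2.0", "-2.0 -2.0 -2.0", " 0.0 0.0" };
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" matches: " + matchesBboxSize(sample));
        }
    }
}
```

Because the whole expression is optional, the empty string is accepted, while negative triplets other than -1 -1 -1 and wrong-length tuples are rejected.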
also
X3DJSAIL: Utility Methods and Functionality, regexes
http://www.web3d.org/specifications/java/X3DJSAIL.html#regex
* Complete. X3D Regular Expression (regex) support in field types, for example SFVec3fObject REGEX string, pattern, validate(), matches() and matches(String value).
* Progressing. X3D Regular Expression (regex) support for special types, for example SFVec3fObject.matchesBboxSizeType(String value). TODO: add Matrix types.
http://www.web3d.org/specifications/java/javadoc/org/web3d/x3d/jsail/fields/SFVec3fObject.html#matchesBboxSizeType-java.lang.String-
Here is the current unit-testing code in FieldObjectTests.java
@Test
@DisplayName("Test SFVec3fObject bboxSizeType checks on single-field 3-tuple single-precision floating-point array")
void SFVec3fBboxSizeObjectTests()
{
System.out.println ("SFVec3fBboxSizeObjectTests for bounding box (bbox) constraints...");
float[] defaultBboxSizeArray = { -1.0f, -1.0f, -1.0f };
assertTrue (Arrays.equals(SFVec3fObject.DEFAULT_VALUE_BBOXSIZETYPE, new SFVec3fObject().setValueByString(SFVec3fObject.DEFAULT_VALUE_BBOXSIZETYPE_STRING).getPrimitiveValue()),
"test DEFAULT_VALUE_BBOXSIZETYPE matches DEFAULT_VALUE_BBOXSIZETYPE_STRING for this field object");
SFVec3fObject testSFVec3fBboxSizeObject = new SFVec3fObject(SFVec3fObject.DEFAULT_VALUE_BBOXSIZETYPE); // static initializer is tested, might throw exception
assertTrue (testSFVec3fBboxSizeObject.matches(), "testSFVec3fBboxSizeObject.matches() tests object initialization correctly matches regex");
assertTrue (Arrays.equals(defaultBboxSizeArray, SFVec3fObject.DEFAULT_VALUE_BBOXSIZETYPE), "test correct default value for this field object");
assertTrue (SFVec3fObject.matches(SFVec3fObject.DEFAULT_VALUE_BBOXSIZETYPE_STRING),
"SFVec3fObject.matches(SFVec3fObject.DEFAULT_VALUE_BBOXSIZETYPE_STRING) tests object initialization correctly matches regex");
assertFalse (testSFVec3fBboxSizeObject.isDefaultValue(),"test initialized field object isDefaultValue() returns false for bboxSize default value -1 -1 -1");
assertTrue (!SFVec3fObject.REGEX_BBOXSIZETYPE.contains("^") && !SFVec3fObject.REGEX_BBOXSIZETYPE.contains("$"), "test SFVec3fObject.REGEX does not contain anchor characters ^ or $");
// avoid unexpected equivalent regexes
assertFalse (SFVec3fObject.REGEX.equals(SFVec3fObject.REGEX_BBOXSIZETYPE), "test SFVec3fObject.REGEX.equals(SFVec3fObject.REGEX_BBOXSIZETYPE) returns false");
testSFVec3fBboxSizeObject.setValue(-1.0f, -1.0f, -1.0f);
assertTrue (Arrays.equals(defaultBboxSizeArray, testSFVec3fBboxSizeObject.getPrimitiveValue()), "tests setting object value to -1.0f -1.0f -1.0f results in singleton array with same value");
testSFVec3fBboxSizeObject.setValue(defaultBboxSizeArray); // returns void because it matches (overrides) Java SAI specification interface
assertTrue (Arrays.equals(defaultBboxSizeArray, testSFVec3fBboxSizeObject.getPrimitiveValue()), "tests setting object value to default-value array results in equivalent getPrimitiveValue()"); // compare array contents, not references
assertFalse (SFVec3fObject.matches (""), "tests empty string \"\" fails SFVec3fObject.matches(value), illegal value");
assertTrue (SFVec3fObject.matchesBboxSizeType(""), "tests empty string \"\" passes SFVec3fObject.matchesBboxSizeType(value), legal value");
assertFalse (testSFVec3fBboxSizeObject.setValue ( -2.0f, -2.0f, -2.0f ).matchesBboxSizeType(), "tests setting object value to -2.0f -2.0f -2.0f fails");
assertFalse (testSFVec3fBboxSizeObject.setValueByString("-2.0 -2.0 -2.0" ).matchesBboxSizeType(), "tests setting object value to \"-2.0 -2.0 -2.0\" fails");
assertFalse (testSFVec3fBboxSizeObject.setValue ( -2.0f, -2.0f, -2.0f ).matchesBboxSizeType(), "tests setting object value to -2.0f -2.0f -2.0f fails");
assertTrue (SFVec3fObject.matches ("-2.0 -2.0 -2.0"), "tests \"-2.0 -2.0 -2.0\" passes SFVec3fObject.matches(value)");
assertFalse (SFVec3fObject.matchesBboxSizeType("-2.0 -2.0 -2.0"), "tests \"-2.0 -2.0 -2.0\" fails SFVec3fObject.matchesBboxSizeType(value)");
assertTrue (SFVec3fObject.matches (" 2.0 2.0 2.0"), "tests \" 2.0 2.0 2.0\" passes SFVec3fObject.matches(value)");
assertTrue (SFVec3fObject.matchesBboxSizeType(" 2.0 2.0 2.0"), "tests \" 2.0 2.0 2.0\" passes SFVec3fObject.matchesBboxSizeType(value)");
assertTrue (SFVec3fObject.matchesBboxSizeType(" 0.0 0.0 0.0"), "tests \" 0.0 0.0 0.0\" passes SFVec3fObject.matchesBboxSizeType(value)");
assertFalse (SFVec3fObject.matchesBboxSizeType(" 0.0 0.0 0.0 0.0"), "tests \" 0.0 0.0 0.0 0.0\" fails SFVec3fObject.matchesBboxSizeType(value), too many values");
assertFalse (SFVec3fObject.matchesBboxSizeType(" 0.0 0.0"), "tests \" 0.0 0.0\" fails SFVec3fObject.matchesBboxSizeType(value), insufficient values");
}
All working. 8)
So yes, we can add regexes for various types, and perhaps even an alternate pattern for a base type if it makes sense. This shows one way.
Thanks for SFImage considerations, will work on these another day.
> SFImage regex considerations:
>
> Are hex values allowed for width etc. ? The spec. only says integer but perhaps decimal is implied.
>
> Can hex. values use any case for letters ? The descriptions only use capitals as 0xFF. Is 0Xff also valid ?
>
> 0 0 0 is the default value. But should 0 be allowed for components in any other case ?
>
> No leading 0s for integer values.
>
> But 0x0000FF is OK.
>
> No negative integers for width etc.
>
> Perhaps allow negative values for color: 255 = -1 for single component, for example.
>
>
>
> -- AP on the road
>
>
> On Fri, Aug 17, 2018, 3:15 PM Don Brutzman <brutzman at nps.edu> wrote:
>
> details follow on current regex progress.
>
> On 8/7/2018 8:21 AM, Don Brutzman wrote:
> > [...]
> > http://www.web3d.org/specifications/X3dRegularExpressions.html#Design
> > http://www.web3d.org/specifications/X3dRegularExpressions.png
> >
> > Continuing:
> >
> > On 8/6/2018 11:52 AM, Andreas Plesch wrote:
> >> Two quick points:
> >>
> >> The hexadecimal pattern got corrupted. SFImage uses this pattern:
> >>
> >> 0x([a-f]|[A-F]|\d]){1,8}
> >>
> >> where \d is the digit character class.
> >
> > Thanks, will work on that pattern next.
>
> Currently using 0x[0-9a-fA-F]{1,8} for hexadecimal value, from Regular Expressions Cookbook.
>
> Preliminary pattern for SFImage is there, but also need to allow integers as alternatives for hex values.
>
> >> There are two equivalent +- patterns
> >> (\+|\-)?
> >> [+|-]?
> >> Probably only one should be recommended and used subsequently.
> >
> > Good catch, thank you. Will scrutinize and normalize.
>
> [+-]? for sign and [Ee] for scientific-notation exponent.
>
> >> I have no good idea how to detect a leading 0 0 0 as non-matching. My
> >> not so good idea would be to explicitly allow
> >> non-0 followed by x x or
> >> x followed by non-0 followed by x or
> >> x x followed by non-0
> >>
> >> where non-0 is something [+-]?0?\.?[0-9]*[1-9]+[0-9]*([e|E][+-]?[0-9]+)*
> >> and x is a floating point number
> >>
> >> -Andreas
> >
> > Possibly, but am thinking it adds serious complexity that is hard to maintain. So probably not but will think about it.
>
> It is great when we find those patterns, am trying to keep things maintainable. The color-coding on web page definitely helps.
>
> Negative lookahead is good for avoiding illegal values, but is not allowed in XML Schema (probably to avoid computational denial-of-service attacks). Have listed a pattern nevertheless, since other languages/tools might want to use it.
>
> ==================
> Disallowed values:
>
> * Negative lookahead filters can disqualify attributes that contain illegal values.
> * W3C Recommendation for XML Schema (XSD) unfortunately does not support this construct.
> * (?!((0|0.0*|.0+)\s+){3}) prohibits 0 0 0 since zero vector is illegal as initial axis triplet of an SFRotation.
> * (?!\s*,\s*,) prohibits multiple adjacent commas in intermediate whitespace, for example 0 0 0, ,0 0 0 is an illegal set of MFColor values.
> ==================
>
> > Thanks for the continuing review! Very helpful.
> Steadily better and better... need to check each regex101 unit test to make sure properly updated. SF/MFImage are next, then matrix types.
all the best, Don
--
Don Brutzman Naval Postgraduate School, Code USW/Br brutzman at nps.edu
Watkins 270, MOVES Institute, Monterey CA 93943-5000 USA +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman