[x3d-public] Major stumbling block in X3DJSAIL. NMTOKEN regexp
Don Brutzman
brutzman at nps.edu
Thu Jan 26 12:06:25 PST 2017
Rechecked: current NMTOKEN regex in Java code comes from the stylesheet, not the X3D Schema. Typical insertion:
!newValue.matches("[a-zA-Z_][a-zA-Z_0-9]*")) // NMTOKEN character regex check
An older XML Schema Recommendation has the following:
3.2.6 NMTOKEN
https://www.w3.org/1999/05/06-xmlschema-2/#NMTOKEN
Issue (nmtoken-primitive-or-generated): should NMTOKEN be defined as a primitive (as above) or as a subtype of [string] with a regular expression facet such as "[a-zA-Z0-9_-]+" (or whatever the regular expression actually should be to match the Nmtoken production)?
but the current REC omits any specific regexes for each type.
3.3.4 NMTOKEN
https://www.w3.org/TR/xmlschema-2/#NMTOKEN
"[Definition:] NMTOKEN represents the NMTOKEN attribute type from [XML 1.0 (Second Edition)]. The ·value space· of NMTOKEN is the set of tokens that ·match· the Nmtoken production in [XML 1.0 (Second Edition)]. The ·lexical space· of NMTOKEN is the set of strings that ·match· the Nmtoken production in [XML 1.0 (Second Edition)]. The ·base type· of NMTOKEN is token."
Continuing there, in the referenced XML Recommendation:
===============================
2.3 Common Syntactic Constructs
[...]
An Nmtoken (name token) is any mixture of name characters.
[Definition: A Name is an Nmtoken with a restricted set of initial characters.] Disallowed initial characters for Names include digits, diacritics, the full stop and the hyphen.
[4] NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[4a] NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
[5] Name ::= NameStartChar (NameChar)*
[6] Names ::= Name (#x20 Name)*
[7] Nmtoken ::= (NameChar)+
[8] Nmtokens ::= Nmtoken (#x20 Nmtoken)*
===============================
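For reference, productions [4], [4a], [5] and [7] quoted above can be transcribed almost directly into java.util.regex character classes. The following is only a sketch, not the X3DJSAIL implementation: the class name NmtokenSketch and its method names are hypothetical, and it assumes a Java 7 or later runtime so that the \x{...} code point escape is available.

```java
import java.util.regex.Pattern;

// Hypothetical helper: transcribes the quoted XML productions into Java regexes.
final class NmtokenSketch {
    // [4] NameStartChar, written as the inside of a character class
    static final String NAME_START_CHAR =
        ":A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF"
        + "\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F"
        + "\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD"
        + "\\x{10000}-\\x{EFFFF}";

    // [4a] NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | ...
    static final String NAME_CHAR =
        NAME_START_CHAR + "\\-.0-9\\u00B7\\u0300-\\u036F\\u203F-\\u2040";

    // [5] Name ::= NameStartChar (NameChar)*
    static final Pattern NAME =
        Pattern.compile("[" + NAME_START_CHAR + "][" + NAME_CHAR + "]*");

    // [7] Nmtoken ::= (NameChar)+
    static final Pattern NMTOKEN = Pattern.compile("[" + NAME_CHAR + "]+");

    static boolean isName(String value) {
        return NAME.matcher(value).matches();
    }

    static boolean isNmtoken(String value) {
        return NMTOKEN.matcher(value).matches();
    }

    public static void main(String[] args) {
        // A leading digit is a valid Nmtoken but not a valid Name.
        System.out.println(isNmtoken("0123")); // true
        System.out.println(isName("0123"));    // false
        // Whitespace is not a NameChar, so it is rejected by both.
        System.out.println(isNmtoken("a b"));  // false
    }
}
```

Per production [7], an Nmtoken places no restriction on its first character; only Name ([5]) requires a NameStartChar in the first position.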
Noting that "-" and "." are allowed after the start character, and avoiding entity characters for now, an improved Java construct is:
!newValue.matches("[a-zA-Z_][a-zA-Z0-9_.-]*")) // NMTOKEN character regex check
Successful test case in HelloWorldProgram.java:
box.setSize(boxSize).setCssClass("textured").setDEF("test-NMTOKEN_regex.0123456789");
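The effect of the change can be checked in isolation with a small standalone sketch. The two regular expressions and the DEF value are copied from this message; the class name NmtokenRegexCheck and its method names are hypothetical and are not part of X3DJSAIL.

```java
import java.util.regex.Pattern;

// Hypothetical standalone check comparing the earlier and revised regexes.
final class NmtokenRegexCheck {
    // Earlier stylesheet-generated check, as quoted in this message.
    static final Pattern OLD_CHECK = Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*");
    // Revised check, as quoted in this message.
    static final Pattern NEW_CHECK = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_.-]*");

    static boolean passesOld(String value) {
        return OLD_CHECK.matcher(value).matches();
    }

    static boolean passesNew(String value) {
        return NEW_CHECK.matcher(value).matches();
    }

    public static void main(String[] args) {
        String def = "test-NMTOKEN_regex.0123456789";
        System.out.println(passesOld(def));    // false: "-" and "." are rejected
        System.out.println(passesNew(def));    // true
        // The revised check still requires a letter or underscore first.
        System.out.println(passesNew("0123")); // false
    }
}
```

Note that the revised expression still constrains the first character, so a value such as "0123" is rejected even though production [7] would accept it as an Nmtoken. That may be part of the remaining work on these regular expressions.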
Changes checked in and deployed, available online.
A TODO remains: future work on this and all X3D Schema regular expressions, for completeness and correctness.
all the best, Don
--
Don Brutzman Naval Postgraduate School, Code USW/Br brutzman at nps.edu
Watkins 270, MOVES Institute, Monterey CA 93943-5000 USA +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman