[x3d-public] Major stumbling block in X3DJSAIL. NMTOKEN regexp

Don Brutzman brutzman at nps.edu
Thu Jan 26 12:06:25 PST 2017


Rechecked: the current NMTOKEN regex in the Java code comes from the stylesheet, not the X3D Schema.  A typical insertion:

	!newValue.matches("[a-zA-Z_][a-zA-Z_0-9]*")) // NMTOKEN character regex check

An older XML Schema draft (the 6 May 1999 Working Draft) has the following:

	3.2.6 NMTOKEN
	https://www.w3.org/1999/05/06-xmlschema-2/#NMTOKEN
	Issue (nmtoken-primitive-or-generated): should NMTOKEN be defined as a primitive (as above) or as a subtype of [string] with a regular expression facet such as "[a-zA-Z0-9_-]+" (or whatever the regular expression actually should be to match the Nmtoken production)?

but the current Recommendation omits any type-specific regexes:

	3.3.4 NMTOKEN
	https://www.w3.org/TR/xmlschema-2/#NMTOKEN
	"[Definition:]   NMTOKEN represents the NMTOKEN attribute type from [XML 1.0 (Second Edition)]. The ·value space· of NMTOKEN is the set of tokens that ·match· the Nmtoken production in [XML 1.0 (Second Edition)]. The ·lexical space· of NMTOKEN is the set of strings that ·match· the Nmtoken production in [XML 1.0 (Second Edition)]. The ·base type· of NMTOKEN is token."

Following that reference to the XML 1.0 Recommendation:

===============================
2.3 Common Syntactic Constructs

[...]
An Nmtoken (name token) is any mixture of name characters.

[Definition: A Name is an Nmtoken with a restricted set of initial characters.] Disallowed initial characters for Names include digits, diacritics, the full stop and the hyphen.

[4]   	NameStartChar ::=   	":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[4a]   	NameChar   ::=   	NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
[5]   	Name	   ::=   	NameStartChar (NameChar)*
[6]   	Names	   ::=   	Name (#x20 Name)*
[7]   	Nmtoken	   ::=   	(NameChar)+
[8]   	Nmtokens   ::=   	Nmtoken (#x20 Nmtoken)*
===============================
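For reference, productions [4] through [7] above can be transcribed into Java regular expressions fairly directly. The following is only a sketch, not the X3DJSAIL code: the class name XmlNameSketch and its methods are hypothetical, and it assumes Java 7 or later so that the \x{...} escape can express the supplementary range. Note that production [7] allows any NameChar in the first position, so a digit, hyphen or period may begin an Nmtoken but not a Name.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of X3DJSAIL.
class XmlNameSketch {
    // Production [4] NameStartChar, written as the inside of a character class.
    static final String NAME_START_CHAR =
          ":A-Z_a-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02FF"
        + "\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F"
        + "\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD"
        + "\\x{10000}-\\x{EFFFF}";

    // Production [4a] NameChar: NameStartChar plus "-", ".", digits and a few extras.
    static final String NAME_CHAR =
          NAME_START_CHAR + "\\-.0-9\\u00B7\\u0300-\\u036F\\u203F-\\u2040";

    // Production [5] Name ::= NameStartChar (NameChar)*
    static final Pattern NAME =
        Pattern.compile("[" + NAME_START_CHAR + "][" + NAME_CHAR + "]*");

    // Production [7] Nmtoken ::= (NameChar)+
    static final Pattern NMTOKEN =
        Pattern.compile("[" + NAME_CHAR + "]+");

    static boolean isName(String value) {
        return NAME.matcher(value).matches();
    }

    static boolean isNmtoken(String value) {
        return NMTOKEN.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "test-NMTOKEN_regex.0123456789", "0123", "-abc", "a b" };
        for (String s : samples) {
            System.out.println("'" + s + "' Name=" + isName(s) + " Nmtoken=" + isNmtoken(s));
        }
    }
}
```

With this transcription, "0123" and "-abc" are accepted as Nmtoken values but rejected as Names, while "a b" is rejected by both because a space is not a NameChar.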

Noting that "-" and "." are allowed after the start character, and avoiding entity characters for now, an improved Java construct is:

	!newValue.matches("[a-zA-Z_][a-zA-Z0-9_.-]*")) // NMTOKEN character regex check

Successful test case in HelloWorldProgram.java:

	box.setSize(boxSize).setCssClass("textured").setDEF("test-NMTOKEN_regex.0123456789");
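The effect of the change can also be checked outside X3DJSAIL with a small standalone program. This is a minimal sketch under stated assumptions: the class NmtokenRegexCheck and its method names are hypothetical, and only the two regular expressions and the DEF value are taken from this message.

```java
import java.util.regex.Pattern;

// Hypothetical standalone check, not part of X3DJSAIL.
class NmtokenRegexCheck {
    // Earlier check from this message: no hyphen or period allowed.
    static final Pattern OLD_CHECK = Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*");

    // Improved check from this message: hyphen and period allowed after the first character.
    static final Pattern NEW_CHECK = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_.-]*");

    static boolean passesOld(String value) {
        return OLD_CHECK.matcher(value).matches();
    }

    static boolean passesNew(String value) {
        return NEW_CHECK.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "test-NMTOKEN_regex.0123456789", // DEF value from the test case above
            "textured",
            "0123",
            "has space"
        };
        for (String s : samples) {
            System.out.println("'" + s + "' old=" + passesOld(s) + " new=" + passesNew(s));
        }
    }
}
```

The DEF value from the test case fails the earlier expression (because of the hyphen and period) and passes the improved one. A value such as "0123" is rejected by both expressions even though production [7] would accept it, so the improved check is still stricter than the Nmtoken production in its first character.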

Changes checked in and deployed, available online.

TODO remains: future work on this and all other X3D Schema regular expressions, for completeness and correctness.

all the best, Don
-- 
Don Brutzman  Naval Postgraduate School, Code USW/Br       brutzman at nps.edu
Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA   +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman
