[x3d-public] SFImage regex

Andreas Plesch andreasplesch at gmail.com
Fri Jun 8 07:44:39 PDT 2018


On Fri, Jun 8, 2018, 2:28 AM Don Brutzman <brutzman at nps.edu> wrote:

> Lots of great questions.  Apologies for not answering as best possible, am
> traveling.
>
> a. Haven't included leading or trailing whitespace since examples all get
> canonicalized first.
>
> Also the X3D XML Schema constructs all include <xs:whiteSpace
> value="collapse"/> for validation
>

It would be then important to point out on the regex page that the regexes
only apply after trimming whitespace and canonicalization.
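
To illustrate the point, here is a minimal Python sketch. The helper name collapse_whitespace and the pattern STRICT_TRIPLE are made up for this example and are not taken from the regex page or the schema. The sketch emulates the schema's whiteSpace collapse step and shows that a strict single-space pattern only succeeds once the value has been collapsed.

```python
import re

def collapse_whitespace(value: str) -> str:
    """Emulate xs:whiteSpace value="collapse": reduce each run of space,
    tab, CR or LF to one space, then trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")

# Hypothetical strict pattern: three integers separated by exactly one space.
STRICT_TRIPLE = re.compile(r"\d+( \d+){2}")

raw = "  1\t2 \n 3  "
print(STRICT_TRIPLE.fullmatch(raw) is not None)                       # False
print(STRICT_TRIPLE.fullmatch(collapse_whitespace(raw)) is not None)  # True
```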

I agree with Leonard that the regexes should only reflect what is in the
spec. directly, for general application. In addition it looks like there is
a need for a set of strict patterns which deal with multiple commas, and
leading zeroes and such.


> b. We should also try to come up with a subpattern that handles
> one-and-only-one comma amidst other optional whitespace, avoiding
> catastrophic backtracking.  The X3D and VRML encodings say to treat commas as
> whitespace.  The XML schema honors that but carefully - not allowing excess
> commas to go unnoticed, since historically that is the only indicator that a
> long line of values has gotten corrupted.
>

(,|\s+) would be such a pattern. However, excess commas may be legal albeit
ugly.
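
A quick Python comparison of separator candidates, as a sketch only. SIMPLE is the pattern given above; ONE_COMMA is an assumed alternative (not from the schema) that accepts either a whitespace run or a single comma with optional whitespace around it. Neither has been tuned or measured for backtracking behavior.

```python
import re

SIMPLE = re.compile(r"(,|\s+)")         # pattern from the message above
ONE_COMMA = re.compile(r"\s*,\s*|\s+")  # assumed alternative, for comparison

# Each candidate separator is tested as a whole string.
for sep in [" ", "  ", ",", ", ", " , ", ",,"]:
    print(repr(sep),
          SIMPLE.fullmatch(sep) is not None,
          ONE_COMMA.fullmatch(sep) is not None)
```

In this sketch both patterns reject a doubled comma as a single separator, but only the second accepts a comma surrounded by spaces.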


> c. the differences of regex syntax among various languages are relatively
> minor and explained at length in lots of documentation... we should be able
> to avoid idiosyncrasies without too much trouble (I think that is the case
> already).
>

It is the case, largely, except for the implicit anchoring in XML. See how
Leonard added ^ and $ anchors.
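
The difference can be sketched in Python, where matching is not anchored by default. The three-integer pattern below is a hypothetical stand-in, not one of the published regexes. re.fullmatch behaves like the implicit anchoring in XML Schema, while re.search accepts partial matches unless anchors are added.

```python
import re

pattern = r"\d+ \d+ \d+"  # hypothetical unanchored three-integer pattern
value = "1 2 3 4"

# Unanchored search succeeds on the prefix "1 2 3".
print(re.search(pattern, value) is not None)                   # True
# fullmatch requires the whole string to match, like XML Schema does.
print(re.fullmatch(pattern, value) is not None)                # False
# Explicit anchors give the same result with search.
print(re.search("^(?:" + pattern + ")$", value) is not None)   # False
```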


> d. looked online and found plausible exemplars on stackoverflow; have
> added regex patterns for SF/MFBool, SFInt32, SFFloat/Double/Time.  comments
> in x3d v4 schema show links.
>

Remember that SFInt32 allows hexadecimal.
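
As a rough sketch of what that could look like, assuming an optional sign, a 0x or 0X prefix for hexadecimal, and no leading zeroes for decimal values. These details are assumptions for illustration and would need to be checked against the spec. The names SFINT32 and parse_sfint32 are hypothetical.

```python
import re

# Assumed shape: optional sign, then hexadecimal with 0x/0X prefix,
# or a decimal integer without leading zeroes.
SFINT32 = re.compile(r"[+-]?(0[xX][0-9a-fA-F]+|0|[1-9][0-9]*)")

def parse_sfint32(text: str) -> int:
    """Validate against the sketch pattern, then convert to int."""
    if SFINT32.fullmatch(text) is None:
        raise ValueError("not an SFInt32 literal under this sketch: %r" % text)
    return int(text, 0)  # base 0 lets int() honor the 0x prefix

print(parse_sfint32("255"), parse_sfint32("0xFF"), parse_sfint32("-0x10"))
```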

I will look into the 0 0 0 SFImage case. It works outside of XML, so the
failure in XML Spy may be due to the $.
https://www.w3.org/TR/xmlschema11-2/#regexs does not seem to define the $
anchor, as a quick search for '$' indicates. This is unfortunate.
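
If $ is indeed not an end anchor in XML Schema regexes, one way around it is to restructure the pattern so that the separator precedes each value after the first, which needs no $ at all. The following is an untested-in-XML sketch, checked here only with Python's re.fullmatch to emulate the implicit anchoring. The name SFIMAGE_NO_DOLLAR is made up for this example.

```python
import re

# Same intent as the SFImage pattern in this thread, but without '$':
# three decimal integers, then zero or more decimal or 0x hex values,
# each preceded by one or more spaces; trailing spaces are allowed.
SFIMAGE_NO_DOLLAR = (
    r"[ \t]*(0|[1-9][0-9]*)([ ]+(0|[1-9][0-9]*)){2}"
    r"([ ]+(0x[0-9a-fA-F]+|0|[1-9][0-9]*))*[ ]*"
)

for value in ["0 0 0", "1 2 3", "2 1 1 0xFF 0x00", "0 0", "01 2 3", "1 2 3 0x"]:
    print(repr(value), re.fullmatch(SFIMAGE_NO_DOLLAR, value) is not None)
```

The first three values are accepted and the last three are rejected in this sketch.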

Andreas


> e. have copied over from XML Schema and added regex patterns to X3DUOM for
> each field type.  example:
>
> <FieldType type="SFImage"
>         regex="[ \t]*(([0-9]|[1-9][0-9]+)([ ]+|$)){3}(([0-9]|([1-9][0-9]+)|(0x([0-9]|[a-f]|[A-F])+))([ ]+|$))*">
>     <InterfaceDefinition specificationUrl="http://www.web3d.org/documents/specifications/19775-1/V3.3/Part01/fieldsDef.html#SFImageAndMFImage"
>         appinfo="The SFImage field specifies a single uncompressed
> 2-dimensional pixel image. SFImage fields contain three integers
> representing the width, height and number of components in the image,
> followed by (width x height) hexadecimal or integer values representing the
> pixels in the image."/>
> </FieldType>
>
> f. X3DJSAIL support: have used this X3DUOM information further and added
> regex values and regex testing to field type objects.  this should
> facilitate experimentation and everyday usage even further.
>
> http://www.web3d.org/specifications/java/X3DJSAIL.html#regex
>
> "Regular expression (regex) support in field types, for example
> SFVec3fObject REGEX string, pattern, matches() and matches(String value)."
>
>
> http://www.web3d.org/specifications/java/javadoc/org/web3d/x3d/jsail/fields/SFVec3fObject.html#REGEX
>
> http://www.web3d.org/specifications/java/javadoc/org/web3d/x3d/jsail/fields/SFVec3fObject.html#pattern
>
> http://www.web3d.org/specifications/java/javadoc/org/web3d/x3d/jsail/fields/SFVec3fObject.html#matches--
>
> http://www.web3d.org/specifications/java/javadoc/org/web3d/x3d/jsail/fields/SFVec3fObject.html#matches-java.lang.String-
>
> Tested in HelloWorldProgram.java satisfactorily:
>
> <!-- SFVec3f default=0 0 0, initial=1 2 3, setValue=4 5 6, multiply(2)=8
> 10 12, normalize()=0.45584232 0.5698029 0.68376344, regex matches()=true -->
> <!-- regex SFVec3f().matches("1 2 3")=true, regex SFVec3f().matches("1 2 3
> 4")=false -->
>
> Having fun with X3D Regexes!
>
> http://www.web3d.org/specifications/X3dRegularExpressions.html
>
> On 6/7/2018 2:27 PM, Andreas Plesch wrote:
> > Apologies for using this thread to keep a record since this is getting
> > technical.
> >
> > The XML encoding for SFImage
> >
> > http://www.web3d.org/documents/specifications/19776-1/V3.3/Part01/EncodingOfFields.html#SFImage
> >
> > mentions whitespace as separator for pixel values. So that would include
> > any kind of whitespace, and perhaps repeated whitespace.
> >
> > Looking at XML, it has its own regex definition including character
> > classes:
> >
> > https://www.w3.org/TR/xmlschema11-2/#cces
> >
> > 4.2.5 lists popular classes including \d for decimal digits and \s for
> > common whitespace. So it should be possible to use those as they are widely
> > recognized outside of XML as well.
> >
> > XML regexes also are anchored implicitly at the start and end, meaning
> > there are no partial matches. Since this is unusual outside of XML, it
> > probably should be mentioned somewhere on the x3d regex page. This is
> > especially important if the regexes are intended to be used for other
> > encodings such as VRML as well.
> >
> > -Andreas
> >
> > On Thu, Jun 7, 2018 at 4:55 PM Andreas Plesch <andreasplesch at gmail.com> wrote:
> >
> >     Two more observations which may be worth stating explicitly:
> >
> >     The regexes are expected to be used just against attribute strings,
> >     not the complete element XML or scene XML. I think that is implied by
> >     how XML native data types are referenced.
> >     Partial matches do not count as successful. That means there needs to
> >     be an additional check that the matched portion of the string is
> >     identical to the whole string. I think that is implied by how the
> >     existing regexes are formulated.
> >
> >     And two more questions:
> >
> >     The existing regexes do not allow for leading white space. It looks
> >     like this is inspired by XML spec. regexes:
> >     https://www.w3.org/TR/xmlschema11-2/#decimal . However, native XML
> >     decimal integers and floats allow leading white space due to the fixed
> >     whiteSpace: collapse restriction.
> >     Should optional leading white space therefore be added to the existing
> >     regexes? I think so, or alternatively it should be disallowed for the
> >     fields that use native types (by not using native types).
> >     For SF fields which contain multiple numbers, such as SFVec or
> >     SFColor, the existing regexes require exactly one space character as
> >     separator. What is the rationale for not allowing repeated space
> >     characters, which may help with formatting?
> >
> >     -Andreas
> >
> >     On Thu, Jun 7, 2018 at 2:11 PM Andreas Plesch <andreasplesch at gmail.com> wrote:
> >
> >         Since I got started with the regexes, let's look at SFImage as
> >         it is still a TODO.
> >
> >         http://www.web3d.org/documents/specifications/19775-1/V3.3/Part01/fieldsDef.html#SFImageAndMFImage
> >
> >         appears to be the main (only?) source of the format description.
> >
> >         We need exactly three non-negative decimal integers followed by
> >         zero or more non-negative decimal or hexadecimal integers.
> >
> >         There is no mention of the separator, so let's look at some
> >         example scenes:
> >
> >         http://www.web3d.org/x3d/content/examples/ConformanceNist/Appearance/PixelTexture/index.html
> >
> >         The NIST examples all use space as separator.
> >         http://www.web3d.org/x3d/tooltips/X3dTooltips.html#PixelTexture
> >         examples all use space.
> >
> >         http://www.web3d.org/specifications/X3dSchemaDocumentation4.0/x3d-4.0_SFImage.html
> >         has a minimum length of 5, presumably as a result of three single
> >         digits for width, height and components plus two separator
> >         characters.
> >
> >         So one question is: Are single commas legal to separate numbers
> >         in SFImage from each other?
> >         http://www.web3d.org/specifications/X3dRegularExpressions.html says
> >         commas are only allowed in MF fields, so let's say the answer is no.
> >
> >         What about leading zeroes? The general guidance in
> >         http://www.web3d.org/specifications/X3dRegularExpressions.html does
> >         not allow leading zeroes.
> >
> >         There is a requirement to have width x height pixel values, but I
> >         am not sure if this requirement can be (easily) checked by a regex.
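
A count check like this is straightforward in ordinary code after the regex has validated the token syntax. Below is a minimal Python sketch, assuming (as the appinfo text quoted in this thread states) that the header is followed by width x height values, one per pixel. The function name sfimage_pixel_count_ok is hypothetical.

```python
def sfimage_pixel_count_ok(value: str) -> bool:
    """Check that an SFImage string carries exactly width * height pixel
    values after its three header integers. Meant to run after a regex
    has already validated the syntax of the individual tokens."""
    tokens = value.split()
    if len(tokens) < 3:
        return False
    try:
        width, height, components = (int(t, 10) for t in tokens[:3])
        pixels = [int(t, 0) for t in tokens[3:]]  # base 0 accepts 0x hex
    except ValueError:
        return False
    if min(width, height, components) < 0:
        return False
    return len(pixels) == width * height

print(sfimage_pixel_count_ok("0 0 0"))            # True
print(sfimage_pixel_count_ok("2 1 1 0xFF 0x00"))  # True
print(sfimage_pixel_count_ok("1 2 3"))            # False
```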
> >
> >         Capital letters:
> >         http://www.web3d.org/documents/specifications/19775-1/V3.3/Part01/fieldsDef.html#SFImageAndMFImage
> >         only uses capital A-F in hexadecimal example values but let's say
> >         a-f is also allowed since they are used in almost all examples.
> >
> >         Ok, let's give it a try:
> >
> >         [ \t]*(([0-9]|[1-9][0-9]+)([ ]+|$)){3}(([0-9]|([1-9][0-9]+)|(0x([0-9]|[a-f]|[A-F])+))([ ]+|$))*
> >
> >         In words:
> >
> >         match any leading white space followed by
> >         exactly three times the following
> >           - one of the following
> >             - either a single digit (including 0) or
> >             - a two or more digit number starting with a 1 to 9 digit
> >           - followed by either
> >             - one or more spaces
> >             - or the end of the string (accommodating the default '0 0 0'
> >               case)
> >         then followed zero or more times by
> >           - one of the following
> >             - either a single digit (including 0) or
> >             - a two or more digit number starting with a 1 to 9 digit or
> >             - 0x followed by at least one of
> >               - a single digit or
> >               - a to f letter or
> >               - A to F letter
> >           - followed by either
> >             - one or more spaces
> >             - or the end of the string
> >         The spaces are written as [ ] for clarity but could be just a
> >         space character.
> >
> >         This allows 0 0 0 as it is the default value. It also allows 1 2 3
> >         without any pixel values, but we do not aim at checking the number
> >         of pixel values anyway.
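
The breakdown in words maps closely onto a commented regex. Here is a sketch using Python's re.VERBOSE flag, with the groups made non-capturing and the hex digit alternatives merged into one character class. Those two changes are simplifications for this example and are intended to match the same strings as the pattern above. The name SFIMAGE is made up.

```python
import re

SFIMAGE = re.compile(r"""
    [ \t]*                       # any leading white space
    (?: (?:[0-9]|[1-9][0-9]+)    # single digit, or number without leading zero
        (?:[ ]+|$)               # one or more spaces, or end of string
    ){3}                         # exactly three: width, height, components
    (?: (?:[0-9]|[1-9][0-9]+|0x[0-9a-fA-F]+)  # decimal or 0x hex pixel value
        (?:[ ]+|$)               # one or more spaces, or end of string
    )*                           # zero or more pixel values
""", re.VERBOSE)

for value in ["0 0 0", "1 2 3", "2 1 1 0xFF 0x00", "1 2", "01 2 3"]:
    print(repr(value), SFIMAGE.fullmatch(value) is not None)
```

In verbose mode the space inside [ ] is kept because it sits in a character class, which is one more reason to write it that way.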
> >
> >         Any input, in particular examples of problem cases, is welcome.
> >         Since images can be huge it may be necessary to optimize for
> >         performance and memory, which may require more regex expertise than
> >         I can bring myself to acquire.
> >
> >         -Andreas
> >
> >         --
> >         Andreas Plesch
> >         Waltham, MA 02453
> >
>
>
> all the best, Don
> --
> Don Brutzman  Naval Postgraduate School, Code USW/Br
> brutzman at nps.edu
> Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA   +1.831.656.2149
> X3D graphics, virtual worlds, navy robotics
> http://faculty.nps.edu/brutzman
>
>