[x3d-public] xml:lang language identification, X3D metadata patterns for XMP and multilingual support

vmarchetti at kshell.com
Thu Nov 25 18:25:11 PST 2021



> On Nov 25, 2021, at 11:00 AM, Brutzman, Donald (Don) (CIV) <brutzman at nps.edu> wrote:
> 
> Thanks for your insights and helpful summary Jakub.
>  
> With respect to xml:lang support in X3D XML encoding, am looking at ways we might effectively support it in validation tools and documentation as an allowed XML attribute that is not a preferred approach, emphasizing declarative MetadataString representations instead.  Have started variation testing on an example scene.
> 
> With regard to mapping XMP structures (essentially embedding Semantic Web RDF declarations), your suggestion to look at all of the defined terms & relationships there for correspondences to Dublin Core and other known terms is a good one.  Thanks for sharing the XMP specification reference so everyone can look at it (I believe that XMP was later codified as an ISO standard too). This will also help us in future work when mapping glTF JSON-LD (linked-data) constructs, if those folks are actually following such practices.
>  
> With regard to changing X3D4 XML default containerField for Metadata* nodes as ‘value’ for much terser Metadata structures, initial tests look good.  Am considering how to best add a corresponding X3D3 XML example that clearly shows necessary compatible verbose forms for X3D versions 3.0 through 3.3.  In combination with good diagnostics and documentation, this should give us a solid path forward for backward/forward compatibility without problems.
>  
> More on language representation: here is an interesting excerpt from  RFC 5646.
> RFC 5646, Tags for Identifying Languages, Best Current Practice BCP 47
> https://datatracker.ietf.org/doc/html/rfc5646
>       For markup languages, such as HTML and XML, language information
>       can be added to each part of the document identified by the markup
>       structure (including the whole document itself).  For example, one
>       could write <span lang="fr">C'est la vie.</span> inside a German
>       document; the German-speaking user could then access a French-
>       German dictionary to find out what the marked section meant.  If
>       the user were listening to that document through a speech
>       synthesis interface, this information could be used to signal the
>       synthesizer to appropriately apply French text-to-speech
>       pronunciation rules to that span of text, instead of applying the
>       inappropriate German rules.
>  
> Am wondering if we can extend this pattern to allow defining compatible multilingual expressions, as often occurs in XML and Semantic Web.  Perhaps something like this:
>  
> <MetadataSet name='description' reference='https://datatracker.ietf.org/doc/html/rfc5646'>
>      <MetadataString name='description' value='Hello World'      reference='lang=EN'/>
>      <MetadataString name='description' value='Witaj świecie'    reference='lang=PL'/>
>      <MetadataString name='description' value='Bonjour le monde' reference='lang=FR'/>
>      <MetadataString name='description' value='Hola Mundo'       reference='lang=ES'/>
> </MetadataSet>
>  
> Nesting the 'lang=??' construct as an overload on the reference field is a bit awkward, and also limits proper use of the reference field (instead of pointing to a semantic reference as intended).  Possible simpler alternative follows, which looks appealing and would also facilitate X3D parsing/conversions to various programming languages and file encodings:
>  
> <MetadataSet name='description' reference='https://datatracker.ietf.org/doc/html/rfc5646'>
>      <MetadataString name='description' value='Hello World'      lang='EN'/>
>      <MetadataString name='description' value='Witaj świecie'    lang='PL'/>
>      <MetadataString name='description' value='Bonjour le monde' lang='FR'/>
>      <MetadataString name='description' value='Hola Mundo'       lang='ES'/>
> </MetadataSet>
>  
> Since multilingual support is already a goal for X3D (but we only support one language at a time), this second approach is worth considering.  In addition to usage in metadata structures like XMP, it seems useful across the rest of X3D as well. For example an author might put such a construct within an Anchor or TouchSensor or Text node, and an X3D player might then offer the corresponding description matching user (or HTML page, or Web browser) language preferences.
>  
> Note that it would require an additional field for MetadataString nodes.  Candidate specification addition follows:
> https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-CD1/Part01/components/core.html#MetadataString
> 7.4.6 MetadataString
> MetadataString : X3DNode, X3DMetadataObject {
>   SFNode   [in,out] metadata  NULL [X3DMetadataObject]
>   SFString [in,out] name      ""
>   SFString [in,out] reference ""
>   MFString [in,out] value     []
>   SFString [in,out] lang      ""
> }
> The lang field identifies corresponding human language for the provided value strings in accordance with [RFC5646].
>  
> Since this additional lang field is only suggested for X3D4 MetadataString node, not seeing any negative impact on existing X3D content or implementations.
>  
> Opinions please?

I offer the judgement that this X3D MetadataSet:

<MetadataSet name='description' reference='https://datatracker.ietf.org/doc/html/rfc5646'>
     <MetadataString name='description' value='Hello World'>
        <MetadataString containerField='metadata' name='lang' value='EN' reference='http://www.w3.org/XML/1998/namespace'/>
     </MetadataString>
     <MetadataString name='description' value='Witaj świecie'>
        <MetadataString containerField='metadata' name='lang' value='PL' reference='http://www.w3.org/XML/1998/namespace'/>
     </MetadataString>
     <MetadataString name='description' value='Bonjour le monde'> 
        <MetadataString containerField='metadata' name='lang' value='FR' reference='http://www.w3.org/XML/1998/namespace'/>
     </MetadataString>
     <MetadataString name='description' value='Hola Mundo'>
        <MetadataString containerField='metadata' name='lang' value='ES' reference='http://www.w3.org/XML/1998/namespace'/>
     </MetadataString>
</MetadataSet>

satisfies the same purpose as the one above using the proposed field name lang, without the need to change the X3D standard. My opinion is against adding a lang field to the MetadataString node.
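
As a rough sketch of how a consuming tool might read the nested form above, the Python below (standard library only) collects each description keyed by its nested lang entry and then picks one by preference. The function names descriptions_by_language and pick_description, the lower-casing of tags, the exact-match lookup, and the 'en' fallback are all assumptions made for illustration; none of this comes from the X3D specification or any existing player. The value attributes are read as plain strings, with no MFString quote handling.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the MetadataSet above (two languages only).
X3D_FRAGMENT = """
<MetadataSet name='description' reference='https://datatracker.ietf.org/doc/html/rfc5646'>
  <MetadataString name='description' value='Hello World'>
    <MetadataString containerField='metadata' name='lang' value='EN' reference='http://www.w3.org/XML/1998/namespace'/>
  </MetadataString>
  <MetadataString name='description' value='Witaj świecie'>
    <MetadataString containerField='metadata' name='lang' value='PL' reference='http://www.w3.org/XML/1998/namespace'/>
  </MetadataString>
</MetadataSet>
"""


def descriptions_by_language(xml_text):
    """Map lower-cased language tag -> description string (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    table = {}
    # Direct MetadataString children of the MetadataSet hold the descriptions.
    for entry in root.findall("MetadataString"):
        # The language is carried by a nested MetadataString in the metadata field.
        for meta in entry.findall("MetadataString"):
            if meta.get("containerField") == "metadata" and meta.get("name") == "lang":
                table[meta.get("value", "").lower()] = entry.get("value")
    return table


def pick_description(table, preferred, fallback="en"):
    """Return the first exact match among preferred tags, else the fallback entry."""
    for tag in preferred:
        if tag.lower() in table:
            return table[tag.lower()]
    return table.get(fallback)


if __name__ == "__main__":
    table = descriptions_by_language(X3D_FRAGMENT)
    print(pick_description(table, ["de", "pl"]))
```

This only does exact, case-insensitive tag matching; a real player would presumably want proper BCP 47 matching against the user's language preferences.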


Vince Marchetti

