[X3D-Public] x3d html5
Joe D Williams
joedwil at earthlink.net
Wed Jun 2 19:07:09 PDT 2010
With the current interest in HTML5 and forward as a 'native' home for
WWW3D, I predict X3D will someday include a profile of some same and
some revised nodes (elements) and an alternative event system aimed at
taking advantage of the 'free' DOM provided by consolidation. This is
demonstrably a great XML-based authortime environment, and will work
with OGL and WebGL bindings, and also with those hosts that may
actually integrate a more advanced, higher-performance nD+1 scenegraph
and eventgraph rendering context than WebGL connections can provide.
The idea is to work with the W3C and produce an open standard, usable
in HTML5, for platform-agnostic 3D models.
To extend HTML I think we are showing that basic X3D syntax/structure
is acceptable and can be consolidated with other html syntax. This
user code integration means we see code like <x3d> ... </x3d> used in
a document that includes <p> ... </p> and <svg> ... </svg> and other
html5, and it works as if highly integrated, even seamless, like any
other set of related elements in the document.
Authorship and delivery for this already works with extensible
features using more or less basic XML tools at X3DOM.org. While it is
a big step to consider freeDOM as part of the current html standard
(or of the X3D standard), it represents a standards-track approach.
The basic package needs development, along with wide availability of
an open, optimized runtime that covers a certain profile of the entire
X3D node and feature set, as well as some connections to related html
features like CSS.
To support X3D in HTML, there has to be a system for clear validation
of the syntax and content models, further supported by multiple free
and open authortime and runtime implementations. This means
standards-track XML schema and XML editors and processors, from which
input for text/html and application/xhtml+xml processors can be
generated.
Behind and alongside that, there needs to be international agreement
on the names and types of certain data structures and for certain
interrelationships between those structures to be standardized. The
path between those best current practices and, hopefully, the best
basis for future practices, is open and free implementations that
support authoring and delivery of the interactive pixels that can be
produced by authors and consumed by others.
The great opportunity is to define this relationship of the existing
X3D open standard for platform-agnostic 3D models with actual user
code and using the DOM produced by an HTML5 processor.
My experiences still seem to point to the idea that finally,
especially when we add in XML, it is the data that is agnostic.
Proprietary, controlled, and closed non-text solutions for common best
practices, and encumbered solutions provide the platform restrictions
(nosticisms?) that limit reuse between platforms. That is, it is all
numbers and characters, arranged to represent live 1D, 2D, 3D, 4D, and
yes, even nD+1 manifestations.
Authortime: To be fun, these numbers and characters must be
arrangeable and parameterizable in some authortime, then enlivened
with processing by some runtime that accepts the data and produces
interactive multiple media. To feed this runtime, we want authortime
to arrange the data into semantic containers that include the idea of
readability by interested humans. After all, most everything except
screen shots or videos is transient; only the actual visible,
comprehensible, leveragable, authortime user code that is readable and
tweakable using the most simple utf-n text editor really matters.
Needless to say, the optimized runtime structures are pure and can
only be damaged by faulty user code generated in authortime. That
damage happens when the basic containers for the data are not named
consistently, or when the data is in a different architecture than is
expected by the loader that transmutes the user code into structures
traversed and updated by the runtime. Today, the syntax and structure
of the basic data matter even less, with many tools capable of
automated programmatic transcoding between forms, but the problem
remains: what data structures and semantics to start with when you
wish to deliver realtime interactive nD running freely on our WWW?
Granted, there must eventually be a concise binary delivery form that
provides content security and minimum network bandwidth, but let us
start with the basic stuff for 3D, which is a set of x,y,z coordinates
for each vertex comprising a shape. There are more than a few ways to
declare this in
authortime, and there should be, but finally, the runtime seems to
need a data set that actually has one x,y,z coordinate for each vertex
of each triangle of each shape that can be perceived and interacted
with. (For each rendering, the runtime shines virtual lights on the
virtual triangles that make up the virtual shapes and then figures out
how the next frame can be represented to the interactor.)
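A minimal sketch of that expansion, assuming the face list is already
triangles (a real IndexedFaceSet may need triangulation first). The
field names mirror X3D's Coordinate "point" and IndexedFaceSet
"coordIndex", with -1 as the face terminator; the function itself is
illustrative, not any particular loader's API:

```javascript
// Four shared corner points of a unit quad.
const point = [
  0, 0, 0,   // vertex 0
  1, 0, 0,   // vertex 1
  1, 1, 0,   // vertex 2
  0, 1, 0,   // vertex 3
];

// Two triangles sharing an edge; -1 ends each face, as in X3D.
const coordIndex = [0, 1, 2, -1, 0, 2, 3, -1];

// Produce the flat list the runtime wants: one x,y,z triple per
// vertex of each triangle, shared corners duplicated as needed.
function expandTriangles(point, coordIndex) {
  const out = [];
  for (const i of coordIndex) {
    if (i === -1) continue;  // face terminator, nothing to emit
    out.push(point[3 * i], point[3 * i + 1], point[3 * i + 2]);
  }
  return out;
}

const flat = expandTriangles(point, coordIndex);
// 2 triangles * 3 vertices * 3 components = 18 numbers.
```

Note the authortime form stores each corner once; the runtime form
happily repeats it per triangle.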
Sets of coordinates that make up shapes can be containerized in many
different ways. And, each of these coordinate points can also include
other related information, like color, texture, physics, shininess,
transparency, and so on depending upon your time and budget and your
needs for realistic rez.
Fortunately, as a foundation for this there is the GL, our venerable,
modernized OGL, to provide the anchor that shows us requirements and
best practice for organizing and processing this data to provide
accelerated rendering. Now, after 'shaders' scripting is added, things
are different but the same: the runtime still mostly processes the
same stuff (vertex lists comprising triangles that reflect, absorb, or
transmit light) at some crucial point in its realization.
One form used by the GL to work with the data, and upward to
abstractions that are authorable and basically readable in ordinary
XML text form, is shown in the vocabulary and data structures
supported by X3D. Here, we call a shape a Shape and provide
connections to the several related data structures that represent
appearance and animate the shape. It is important to recognize the X3D
set of abstractions as providing semantically named paths to carry
data between authortime and runtime.
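As a sketch of what those semantically named paths buy us, here is the
Shape/Appearance/Material/Coordinate nesting as plain data, with a
hypothetical loader that walks the fixed field names down to runtime
buffers. The node and field names mirror X3D; the loadShape function
is illustrative, not X3DOM's or any browser's API:

```javascript
// An authortime Shape as a plain data structure, field names as in X3D.
const shape = {
  node: "Shape",
  appearance: {
    node: "Appearance",
    material: { node: "Material", diffuseColor: [0.8, 0.1, 0.1] },
  },
  geometry: {
    node: "IndexedFaceSet",
    coordIndex: [0, 1, 2, -1],
    coord: { node: "Coordinate", point: [0, 0, 0, 1, 0, 0, 0, 1, 0] },
  },
};

// Because the container names are fixed, the loader always knows
// where to find color, positions, and face indices.
function loadShape(shape) {
  return {
    color: shape.appearance.material.diffuseColor,
    positions: shape.geometry.coord.point,
    indices: shape.geometry.coordIndex.filter((i) => i !== -1),
  };
}
```

The consistent naming is exactly what protects the "pure" runtime
structures from faulty user code mentioned earlier.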
Runtime: For high fidelity nD+1, this data is arranged into a graph
structure with a hierarchy that helps the runtime gracefully update,
traverse, and render the visualization. For example, if you are
building a humanoid character, you want to have the data arranged
conveniently so that when you move a parent shoulder joint, the child
arm, hand, and finger segments act like they are all attached - just
like you would expect a hierarchy of associated groups of objects to
respond in real life.
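A toy version of that shoulder-arm-hand idea, assuming
translation-only transforms (real X3D Transforms also carry rotation
and scale); the node shape and function here are illustrative:

```javascript
// Compose world positions parent-to-child down a transform tree.
function worldPosition(node, parentPos = [0, 0, 0]) {
  const here = [
    parentPos[0] + node.translation[0],
    parentPos[1] + node.translation[1],
    parentPos[2] + node.translation[2],
  ];
  const result = { [node.name]: here };
  for (const child of node.children || []) {
    Object.assign(result, worldPosition(child, here));
  }
  return result;
}

const shoulder = {
  name: "shoulder", translation: [0, 1.4, 0],
  children: [{
    name: "arm", translation: [0, -0.3, 0],
    children: [{ name: "hand", translation: [0, -0.3, 0], children: [] }],
  }],
};

// Raise the shoulder; the arm and hand follow automatically, because
// their positions are expressed relative to the parent.
shoulder.translation[1] += 0.1;
const pos = worldPosition(shoulder);
```

The author edits one joint; the hierarchy does the rest.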
When looking at how to mark up this mostly common data for authorship
using ISO standard names for the data and giving content models for a
relatively author-friendly text-based system to define the geometry,
lighting, navigation, animation, interactivity, and networking for
consumption by a context that understands an open, well-defined and
proven scenegraph (call it a DAG), please look at X3D.
Event Flow: In addition to a realistic hierarchy for related shapes
and interactions within a scene, we also need an event graph that
allows process flow between different scene elements (call them
nodes), and even to the rest of creation outside the context of
current interest. In my experience, what goes for data goes for
events. That is, realistically, there is really only one type of event
that is of interest, and that is an event arising on one node that
exchanges meaningful data with (an)other node(s).
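That one event type can be sketched as a toy ROUTE table in the spirit
of X3D's event graph - the route and sendEvent names here are
illustrative, not the SAI:

```javascript
// A table of declared connections between node fields.
const routes = [];

function route(fromNode, fromField, toNode, toField) {
  routes.push({ fromNode, fromField, toNode, toField });
}

// An event is just a value leaving one node's field and arriving
// at every routed destination field.
function sendEvent(node, field, value) {
  node[field] = value;
  for (const r of routes) {
    if (r.fromNode === node && r.fromField === field) {
      sendEvent(r.toNode, r.toField, value);
    }
  }
}

const timer = { fraction_changed: 0 };
const interp = { set_fraction: 0 };
route(timer, "fraction_changed", interp, "set_fraction");

sendEvent(timer, "fraction_changed", 0.5);
// interp.set_fraction is now 0.5
```

Everything else - sensors, scripts, the External - is variations on
this one exchange.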
Sure, the events passed directly between nodes and the runtime and
even the old in/out with the great ever-expanding External are fun,
but for now just think of one node generating or passing information
to another node (or call them elements). It is even more fun if one or
more of those nodes is an active emitter of some energy, like
temporality, heat, touch, gravity, or whatever
environmental/semantic/empathic event that adds information to the
scene.
So, even as the event basics are known, there turn out to be several
ways to do it. One example is the structures and event systems of the
HTML/XML DOM. Needless to say there are many example event systems for
high-tech 3D, since it is not in the foundation of our beloved
standard, the GL, or in the important ecmascript graphics connection,
the WebGL. Maybe more of one in O3D? However, given the tools
supporting WebGL, a tree/graph traversal and event system can be
derived to produce interactivity. For ISO basics in constructing a
model that can efficiently produce realistic realtime interactions and
detailed realistic simulations, please look at the X3D scene access
interface event graph and data flow sequencing.
The other important thing about realtime 3D is that it is actually
meant to be real time. That is, the next frame of animation is
rendered as soon as possible, not at some fixed frames-per-second
rendering rate. In a typical animation, when the current frame is
complete and ready to be delivered, we have an opportunity to begin
accepting and processing new events and begin computing what the next
frame will be. If the scene is created using a 'single' WebGL script
that computes or traverses data structures to produce animation of
x,y,z and time, then that script runs as often as it can, or else
simulates realtime by refreshing at some target interval.
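A sketch of that variable-rate idea: each frame samples the clock and
maps elapsed time into a 0..1 fraction, in the manner of X3D's looping
TimeSensor. The function name is illustrative:

```javascript
// Map wall-clock time into a looping 0..1 animation fraction.
function fractionAt(now, startTime, cycleInterval) {
  const elapsed = now - startTime;
  // Wrap into the current cycle so the animation loops.
  return (elapsed % cycleInterval) / cycleInterval;
}

// Frames arrive whenever they arrive; the fraction stays correct
// regardless of how long the previous frame took to render.
fractionAt(10.0, 0, 4);  // 0.5 - half-way through the third cycle
fractionAt(10.7, 0, 4);  // a bit further along the same cycle
```

Whether the frame took 16 ms or 160 ms, the animation lands at the
right place in time.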
That leads to the next big step in abstraction: how to extend the GL
and the WebGL scenegraph computation facilities by plugging in some
proven, high-performance compilations of some very practical computing
assets? Please start with ISO X3D nD interpolators and environmental
spacetime sensors - stuff that you soon really want and need, but that
gets very expensive to do even in accelerated, parallelized
ecmascript.
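What one of those compiled assets does is simple to state: an X3D
PositionInterpolator receives set_fraction, finds the bracketing keys,
and linearly interpolates the matching keyValues. A minimal sketch
(illustrative, not any engine's implementation):

```javascript
// key: monotonically increasing fractions; keyValue: one [x,y,z] per key.
function interpolate(key, keyValue, fraction) {
  if (fraction <= key[0]) return keyValue[0].slice();
  const last = key.length - 1;
  if (fraction >= key[last]) return keyValue[last].slice();
  let i = 0;
  while (key[i + 1] < fraction) i++;  // find the bracketing span
  const t = (fraction - key[i]) / (key[i + 1] - key[i]);
  // Linear blend of the two bracketing keyValues, component-wise.
  return keyValue[i].map((a, k) => a + t * (keyValue[i + 1][k] - a));
}

const key = [0, 0.5, 1];
const keyValue = [[0, 0, 0], [0, 2, 0], [4, 2, 0]];

interpolate(key, keyValue, 0.25);  // [0, 1, 0]
```

Doing this per-frame for every animated value in script is exactly the
cost a compiled interpolator node removes.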
Again, a choice of what models to use when considering free and open
extensions should begin with examination of best open practices - what
is actually implemented as free and open technology in industry and
hobby applications of basic and realistic simulations.
The Cascade: An important and expansive impulse change in authortime
and runtime complexity happens when the idea of an event cascade is
introduced. This is when some event produces a change in several nodes
as events propagate in preparation for the next frame. Importantly,
authors may wish to be sure that sets of events are accomplished in
some predictable sequence, maybe in relation to other events that may
be forming content for the next and future frames.
It gets complicated fast with sensors and physics, so we need an
appropriate system of data and events that gives a fair mix between
authortime and runtime conveniences. This is needed when inspired
authoring requires a step back from moving the bits around to allow
focus on modeling humanistic content interactions between virtual
objects and fields rather than detailed ecmascript manipulations -
until you really need the script, then you really need it. This also
implies a strong high-level prototyping facility and the ability to
import or remotely employ and interact with a wide range of external
assets.
Producing a standard model for realtime interactive local and
distributed scene and event graphs is currently under long-term
development. The current ISO X3D offering uses the following model as
a platform-agnostic way of defining what happens when we get some
signal (internal/external event) that all is not static and we need
some new transactions in order to prepare the next frame.
Even though the order in which the events in a particular cascade are
applied is not considered significant, when an initial event is
propagated, other nodes may respond by generating event(s). This
continues until all listening nodes have been honored and all events
realized. This process is called an event cascade. All events
generated during a given cascade are treated as having the same
timestamp as the initial event, and all are considered to happen
instantaneously. Actually, this little detail of the 'instantaneous'
event cascade execution is what gives us the ability to virtualize
reality instant by instant.
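The instantaneous cascade can be sketched as a queue drained to
quiescence, with every follow-on event stamped with the initial
event's timestamp. The queue and listener shapes here are
illustrative, not the SAI:

```javascript
// Drain one cascade: deliver the initial event, let listeners
// generate follow-on events, and stamp them all with the same time.
function runCascade(initialEvent, listeners) {
  const delivered = [];
  const queue = [initialEvent];
  while (queue.length > 0) {
    const ev = queue.shift();
    delivered.push(ev);
    for (const listen of listeners[ev.name] || []) {
      const produced = listen(ev.value);
      if (produced) {
        // Follow-on events inherit the initial event's timestamp.
        queue.push({ ...produced, time: initialEvent.time });
      }
    }
  }
  return delivered;
}

// Hypothetical chain: a touch starts an animation, which sets a color.
const listeners = {
  touch: [(v) => ({ name: "startAnim", value: v })],
  startAnim: [(v) => ({ name: "setColor", value: v })],
  setColor: [],
};

const cascade = runCascade({ name: "touch", value: 1, time: 7.25 }, listeners);
// Three events delivered, all stamped with time 7.25.
```

Because every event in the cascade shares one timestamp, the whole
exchange belongs to a single virtual instant.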
Within this consideration, when the X3D runtime creates a frame it
must operate as if it follows some rules. First, the runtime updates
the camera position and orientation, if required, then, if an initial
event has happened, evaluates the current event cascade. The initial
event may come from animations in the scene when the runtime asks if
time or other features need update. If there are no other initial
events for this frame, then produce the frame.
Or, if there are more initial events that should be considered to
produce the frame, evaluate the next cascade. These steps are meant to
give the author and the runtime some versatility but provide an
authoring path with repeatable operations by eliminating leakage of
results of one cascade across multiple frames or between cascades.
This produces a basic scene ready for some advanced operations, so
next the X3D runtime will evaluate any particle systems, then finally,
if physics, then evaluate the physics model, such as mass and force
and friction. Runtime evaluates any events produced by these
interactions. Repeat as necessary to produce the complete computed
result for the next frame.
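The idealized per-frame sequence described above can be sketched with
placeholder steps that simply record their order - camera, then
cascades until quiet, then particles, then physics and the events it
raised (the runtime object here is illustrative):

```javascript
// Record the idealized order of per-frame operations.
function renderFrame(runtime) {
  const order = [];
  order.push("camera");                       // update position/orientation
  while (runtime.pendingInitialEvents > 0) {  // one cascade per initial event
    runtime.pendingInitialEvents--;
    order.push("cascade");
  }
  order.push("particles");                    // evaluate particle systems
  if (runtime.hasPhysics) {
    order.push("physics");                    // mass, force, friction
    order.push("cascade");                    // events the physics produced
  }
  order.push("render");                       // produce the frame
  return order;
}

renderFrame({ pendingInitialEvents: 2, hasPhysics: true });
// ["camera", "cascade", "cascade", "particles", "physics", "cascade", "render"]
```

A real runtime may reorder or parallelize internally, so long as the
result is as if it followed this sequence.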
Again, this does not mean that the runtime needs to do things in any
particular order, just that the result must follow this idealized
model. Experience has shown that when we have an orderly event system
then stuff just happens in a predictable and transportable way.
Plenty of optimizations may be possible, but this is a basic and
well-proven best-practice model, and it can be updated to include
appropriate optimization practices where that helps transportability,
authorability, and accessibility.
Since the shape parts of 3D are all made up of pretty much the same
ordering of data, when becoming interactive with WebGL the first real
challenge is creating an event system. If the syntax is integrated
with the DOM, then the DOM interfaces to elements sort of come for
free. This is the main challenge when consolidating X3D syntax into
the DOM: how to rationalize the event systems? WebGL and X3DOM are
beginning to show how this will work.
These days, with basic 3D support available in most platforms, the
scenegraph, the eventgraph, the user code data wrappers, the
eventtypes, nodetypes, datatypes, authortime and runtime structures,
and event processing details that the author gets to deal with are the
'nostic' features, generally not current shortcomings of host
platforms. Given the great focus on the GL basics, the layer of WebGL,
and the wider interest in applying 3D to everyday transactions, if we
can define common best practice syntax and structures then we can have
realtime interactive 3D everywhere.
A subset could mean the 3D is a static rendering, or animated along a
time line, maybe with interactive aspects; or the htmlized x3d could
include special semantics for specialized areas such as CAD step
layers, H-Anim skin, joints, and segments, medical rendering styles,
or even rigid body physics. X3D has documented best practice
in important areas and should be considered basic study for its
comprehensive coverage of the field.
We are liable to see interesting things happen when we really truly
consolidate X3D syntax as native code producing objects in the
text/html DOM. We all need to study more and understand the html
hierarchy of element types and associated events as related to the X3D
set of definitions.
Fortunately, with all the XML front-end and back-end tools and stuff
available, along with production of a strong accessible live XML DOM,
that is not out of reach.
In summary, the path to realtime interactive 3D started with how to
create and name the data sets, and how to arrange them into structures
suitable for authoring, transport, rendering, event propagation,
animation, networking, efficient update, and extensibility. As with
HTML, the X3D path continues to evolve as both follow a path to one
web with all the Ds we need, anywhere we want to use them.
Thanks to All and Best Regards,