[X3D-Public] x3d html5
Joe D Williams
joedwil at earthlink.net
Wed Jun 2 19:07:09 PDT 2010
With the current interest in HTML5 and forward as a 'native' home for
WWW3D, I predict X3D will someday include a profile of some same and
some revised nodes (elements) and an alternative event system aimed at
taking advantage of the 'free' DOM provided by consolidation. This is
demonstrably a great XML-based authortime environment, and will work
with OGL and WebGL bindings, and also with those hosts that may
actually integrate a more advanced, higher-performance nD+1 scenegraph
and eventgraph rendering context than WebGL connections can provide.
The idea is to work with the W3C and produce an open standard, usable
in HTML5, for platform-agnostic 3D models.
To extend HTML I think we are showing that basic X3D syntax/structure
is acceptable and can be consolidated with other html syntax. This
user code integration means we see code like <x3d> ... </x3d> used in
a document that includes <p> ... </p> and <svg> ... </svg> and other
html5, and it works as if highly integrated, even seamless, like any
other set of related elements in the document.
Authorship and delivery for this already works with extensible
features using more or less basic XML tools at X3DOM.org. While it is
a big step to consider freeDOM as part of the current html standard
(or of the X3D standard), it represents a standards-track approach.
The basic package needs development, along with wide availability of
an open, optimized runtime that covers a certain profile of the entire
X3D node and feature set, as well as some connections to related html
features like CSS.
To support X3D in HTML, there has to be a system for clear validation
of the syntax and content models, further supported by multiple free
and open authortime and runtime implementations. This means
standards-track XML schema and XML editors and processors, from which
input for text/html and application/xhtml+xml processors can be
generated.
Behind and alongside that, there needs to be international agreement
on the names and types of certain data structures and for certain
interrelationships between those structures to be standardized. The
path between those best current practices and, hopefully, the best
basis for future practices, is open and free implementations that
support authoring and delivery of the interactive pixels that can be
produced by authors and consumed by others.
The great opportunity is to define this relationship of the existing
X3D open standard for platform-agnostic 3D models with actual user
code and using the DOM produced by an HTML5 processor.
My experiences still seem to point to the idea that finally,
especially when we add in XML, it is the data that is agnostic.
Proprietary, controlled, and closed non-text solutions for common best
practices, and encumbered solutions provide the platform restrictions
(nosticisms?) that limit reuse between platforms. That is, it is all
numbers and characters, arranged to represent live 1D, 2D, 3D, 4D, and
yes, even nD+1 manifestations.
Authortime: To be fun, these numbers and characters must be
arrangeable and parameterizable in some authortime, then enlivened
with processing by some runtime that accepts the data and produces
interactive multiple media. To feed this runtime, we want authortime
to arrange the data into semantic containers that include the idea of
readability by interested humans. After all, most everything except
screen shots or videos is transient; only the actual visible,
comprehensible, leveragable, authortime user code that is readable and
tweakable using the most simple utf-n text editor really matters.
Needless to say, the optimized runtime structures are pure and can
only be damaged by faulty user code generated in authortime. That
damage happens when the basic containers for the data are not named
consistently, or when the data is in a different architecture than is
expected by the loader that transmutes the user code into structures
traversed and updated by the runtime. Today, the syntax and structure
of the basic data matter even less, with many tools capable of
automated programmatic transcoding between forms, but the problem
remains: what data structures and semantics to start with when you
wish to deliver realtime interactive nD running freely on our WWW?
Granted, there must eventually be a concise binary delivery form that
provides content security and minimum network bandwidth, but let us
start with the basic stuff for 3D, which is a set of x,y,z coordinates
for each vertex comprising a shape. There are more than a few ways to
declare this in
authortime, and there should be, but finally, the runtime seems to
need a data set that actually has one x,y,z coordinate for each vertex
of each triangle of each shape that can be perceived and interacted
with. (For each rendering, the runtime shines virtual lights on the
virtual triangles that make up the virtual shapes and then figures out
how the next frame can be represented to the interactor.)
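A minimal sketch of that expansion, assuming the face list is already
triangles (a real IndexedFaceSet may need triangulation first). The
field names mirror X3D's Coordinate "point" and IndexedFaceSet
"coordIndex", with -1 as the face terminator; the function itself is
illustrative, not any particular loader's API:

```javascript
// Four shared corner points of a unit quad.
const point = [
  0, 0, 0,   // vertex 0
  1, 0, 0,   // vertex 1
  1, 1, 0,   // vertex 2
  0, 1, 0,   // vertex 3
];

// Two triangles sharing an edge; -1 ends each face, as in X3D.
const coordIndex = [0, 1, 2, -1, 0, 2, 3, -1];

// Produce the flat list the runtime wants: one x,y,z triple per
// vertex of each triangle, shared corners duplicated as needed.
function expandTriangles(point, coordIndex) {
  const out = [];
  for (const i of coordIndex) {
    if (i === -1) continue;  // face terminator, nothing to emit
    out.push(point[3 * i], point[3 * i + 1], point[3 * i + 2]);
  }
  return out;
}

const flat = expandTriangles(point, coordIndex);
// 2 triangles * 3 vertices * 3 components = 18 numbers.
```

Note the authortime form stores each corner once; the runtime form
happily repeats it per triangle.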
Sets of coordinates that make up shapes can be containerized in many
different ways. And, each of these coordinate points can also include
other related information, like color, texture, physics, shininess,
transparency, and so on depending upon your time and budget and your
needs for realistic rez.
Fortunately, as a foundation for this there is the GL, our venerable,
modernized OGL, to provide the anchor that shows us requirements and
best practice for organizing and processing this data to provide
accelerated rendering. Now, after 'shaders' scripting is added, things
are different but the same: the runtime still mostly processes the
same stuff (vertex lists comprising triangles that reflect, absorb, or
transmit light) at some crucial point in its realization.
One form used by the GL to work with the data, and upward to
abstractions that are authorable and basically readable in ordinary
XML text form, is shown in the vocabulary and data structures
supported by X3D. Here, we call a shape a Shape and provide
connections to the several related data structures that represent
appearance and animate the shape. It is important to recognize the X3D
set of abstractions as providing semantically named paths to carry
data between authortime and runtime.
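As a sketch of what those semantically named paths buy us, here is the
Shape/Appearance/Material/Coordinate nesting as plain data, with a
hypothetical loader that walks the fixed field names down to runtime
buffers. The node and field names mirror X3D; the loadShape function
is illustrative, not X3DOM's or any browser's API:

```javascript
// An authortime Shape as a plain data structure, field names as in X3D.
const shape = {
  node: "Shape",
  appearance: {
    node: "Appearance",
    material: { node: "Material", diffuseColor: [0.8, 0.1, 0.1] },
  },
  geometry: {
    node: "IndexedFaceSet",
    coordIndex: [0, 1, 2, -1],
    coord: { node: "Coordinate", point: [0, 0, 0, 1, 0, 0, 0, 1, 0] },
  },
};

// Because the container names are fixed, the loader always knows
// where to find color, positions, and face indices.
function loadShape(shape) {
  return {
    color: shape.appearance.material.diffuseColor,
    positions: shape.geometry.coord.point,
    indices: shape.geometry.coordIndex.filter((i) => i !== -1),
  };
}
```

The consistent naming is exactly what protects the "pure" runtime
structures from faulty user code mentioned earlier.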
Runtime: For high fidelity nD+1, this data is arranged into a graph
structure with a hierarchy that helps the runtime gracefully update,
traverse, and render the visualization. For example, if you are
building a humanoid character, you want to have the data arranged
conveniently so that when you move a parent shoulder joint, the child
arm, hand, and finger segments act like they are all attached - just
like you would expect a hierarchy of associated groups of objects to
respond in real life.
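A toy version of that shoulder-arm-hand idea, assuming
translation-only transforms (real X3D Transforms also carry rotation
and scale); the node shape and function here are illustrative:

```javascript
// Compose world positions parent-to-child down a transform tree.
function worldPosition(node, parentPos = [0, 0, 0]) {
  const here = [
    parentPos[0] + node.translation[0],
    parentPos[1] + node.translation[1],
    parentPos[2] + node.translation[2],
  ];
  const result = { [node.name]: here };
  for (const child of node.children || []) {
    Object.assign(result, worldPosition(child, here));
  }
  return result;
}

const shoulder = {
  name: "shoulder", translation: [0, 1.4, 0],
  children: [{
    name: "arm", translation: [0, -0.3, 0],
    children: [{ name: "hand", translation: [0, -0.3, 0], children: [] }],
  }],
};

// Raise the shoulder; the arm and hand follow automatically, because
// their positions are expressed relative to the parent.
shoulder.translation[1] += 0.1;
const pos = worldPosition(shoulder);
```

The author edits one joint; the hierarchy does the rest.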
When looking at how to mark up this mostly common data for authorship
using ISO standard names for the data and giving content models for a
relatively author-friendly text-based system to define the geometry,
lighting, navigation, animation, interactivity, and networking for
consumption by a context that understands an open, well-defined and
proven scenegraph (call it a DAG), please look at X3D.
Event Flow: In addition to a realistic hierarchy for related shapes
and interactions within a scene, we also need an event graph that
allows process flow between different scene elements (call them
nodes), and even to the rest of creation outside the context of
current interest. In my experience, what goes for data goes for
events. That is, realistically, there is really only one type of event
that is of interest, and that is an event arising on one node that
exchanges meaningful data with (an)other node(s).
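That one event type can be sketched as a toy ROUTE table in the spirit
of X3D's event graph - the route and sendEvent names here are
illustrative, not the SAI:

```javascript
// A table of declared connections between node fields.
const routes = [];

function route(fromNode, fromField, toNode, toField) {
  routes.push({ fromNode, fromField, toNode, toField });
}

// An event is just a value leaving one node's field and arriving
// at every routed destination field.
function sendEvent(node, field, value) {
  node[field] = value;
  for (const r of routes) {
    if (r.fromNode === node && r.fromField === field) {
      sendEvent(r.toNode, r.toField, value);
    }
  }
}

const timer = { fraction_changed: 0 };
const interp = { set_fraction: 0 };
route(timer, "fraction_changed", interp, "set_fraction");

sendEvent(timer, "fraction_changed", 0.5);
// interp.set_fraction is now 0.5
```

Everything else - sensors, scripts, the External - is variations on
this one exchange.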
Sure, the events passed directly between nodes and the runtime and
even the old in/out with the great ever-expanding External are fun,
but for now just think of one node generating or passing information
to another node (or call them elements). It is even more fun if one or
more of those nodes is an active emitter of some energy, like
temporality, heat, touch, gravity, or whatever
environmental/semantic/empathic event that adds information to the
scene.
So, even as the event basics are known, there turn out to be several
ways to do it. One example is the structures and event systems of the
HTML/XML DOM. Needless to say there are many example event systems for
high-tech 3D, since it is not in the foundation of our beloved
standard, the GL, or in the important ecmascript graphics connection,
the WebGL. Maybe more of one in O3D? However, given the tools
supporting WebGL, a tree/graph traversal and event system can be
derived to produce interactivity. For ISO basics in constructing a
model that can efficiently produce realistic realtime interactions and
detailed realistic simulations, please look at the X3D scene access
interface event graph and data flow sequencing.
The other important thing about realtime 3D is that it is actually
meant to be real time. That is, the next frame of animation is
rendered as soon as possible, not at some fixed frames-per-second
rendering rate. In a typical animation, when the current frame is
complete and ready to be delivered, we have an opportunity to begin
accepting and processing new events and begin computing what the next
frame will be. If the scene is created using a 'single' WebGL script
that computes or traverses data structures to produce animation of
x,y,z and time, then that script runs as often as it can, or else
simulates realtime by refreshing at some target interval.
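A sketch of that variable-rate idea: each frame samples the clock and
maps elapsed time into a 0..1 fraction, in the manner of X3D's looping
TimeSensor. The function name is illustrative:

```javascript
// Map wall-clock time into a looping 0..1 animation fraction.
function fractionAt(now, startTime, cycleInterval) {
  const elapsed = now - startTime;
  // Wrap into the current cycle so the animation loops.
  return (elapsed % cycleInterval) / cycleInterval;
}

// Frames arrive whenever they arrive; the fraction stays correct
// regardless of how long the previous frame took to render.
fractionAt(10.0, 0, 4);  // 0.5 - half-way through the third cycle
fractionAt(10.7, 0, 4);  // a bit further along the same cycle
```

Whether the frame took 16 ms or 160 ms, the animation lands at the
right place in time.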
That leads to the next big step in abstraction: how to extend the GL
and the WebGL scenegraph computation facilities by plugging in some
proven, high-performance compilations of some very practical computing
assets? Please start with ISO X3D nD interpolators and environmental
spacetime sensors - stuff that you soon really want and need, but that
gets very expensive to do even in accelerated, parallelized
ecmascript.
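What one of those compiled assets does is simple to state: an X3D
PositionInterpolator receives set_fraction, finds the bracketing keys,
and linearly interpolates the matching keyValues. A minimal sketch
(illustrative, not any engine's implementation):

```javascript
// key: monotonically increasing fractions; keyValue: one [x,y,z] per key.
function interpolate(key, keyValue, fraction) {
  if (fraction <= key[0]) return keyValue[0].slice();
  const last = key.length - 1;
  if (fraction >= key[last]) return keyValue[last].slice();
  let i = 0;
  while (key[i + 1] < fraction) i++;  // find the bracketing span
  const t = (fraction - key[i]) / (key[i + 1] - key[i]);
  // Linear blend of the two bracketing keyValues, component-wise.
  return keyValue[i].map((a, k) => a + t * (keyValue[i + 1][k] - a));
}

const key = [0, 0.5, 1];
const keyValue = [[0, 0, 0], [0, 2, 0], [4, 2, 0]];

interpolate(key, keyValue, 0.25);  // [0, 1, 0]
```

Doing this per-frame for every animated value in script is exactly the
cost a compiled interpolator node removes.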
Again, a choice of what models to use when considering free and open
extensions should begin with examination of best open practices - what
is actually implemented as free and open technology in industry and
hobby applications of basic and realistic simulations.
The Cascade: An important and expansive impulse change in authortime
and runtime complexity happens when the idea of an event cascade is
introduced. This is when some event produces a change in several nodes
as events propagate in preparation for the next frame. Importantly,
authors may wish to be sure that sets of events are accomplished in
some predictable sequence, maybe in relation to other events that may
be forming content for the next and future frames.
It gets complicated fast with sensors and physics, so we need an
appropriate system of data and events that gives a fair mix between
authortime and runtime conveniences. This is needed when inspired
authoring requires a step back from moving the bits around to allow
focus on modeling humanistic content interactions between virtual
objects and fields rather than detailed ecmascript manipulations -
until you really need the script, then you really need it. This also
implies a strong high-level prototyping facility and the ability to
import or remotely employ and interact with a wide range of external
assets.
Producing a standard model for realtime interactive local and
distributed scene and event graphs is currently under long-term
development. The current ISO X3D offering uses the following model as
a platform-agnostic way of defining what happens when we get some
signal (internal/external event) that all is not static and we need
some new transactions in order to prepare the next frame.
Even though the order in which the events in a particular cascade are
applied is not considered significant, when an initial event is
propagated, other nodes may respond by generating event(s). This
continues until all listening nodes have been honored and all events
realized. This process is called an event cascade. All events
generated during a given cascade are treated as having the same
timestamp as the initial event, and all are considered to happen
instantaneously. Actually, this little detail of the 'instantaneous'
event cascade execution is what gives us the ability to virtualize
reality instant by instant.
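The instantaneous cascade can be sketched as a queue drained to
quiescence, with every follow-on event stamped with the initial
event's timestamp. The queue and listener shapes here are
illustrative, not the SAI:

```javascript
// Drain one cascade: deliver the initial event, let listeners
// generate follow-on events, and stamp them all with the same time.
function runCascade(initialEvent, listeners) {
  const delivered = [];
  const queue = [initialEvent];
  while (queue.length > 0) {
    const ev = queue.shift();
    delivered.push(ev);
    for (const listen of listeners[ev.name] || []) {
      const produced = listen(ev.value);
      if (produced) {
        // Follow-on events inherit the initial event's timestamp.
        queue.push({ ...produced, time: initialEvent.time });
      }
    }
  }
  return delivered;
}

// Hypothetical chain: a touch starts an animation, which sets a color.
const listeners = {
  touch: [(v) => ({ name: "startAnim", value: v })],
  startAnim: [(v) => ({ name: "setColor", value: v })],
  setColor: [],
};

const cascade = runCascade({ name: "touch", value: 1, time: 7.25 }, listeners);
// Three events delivered, all stamped with time 7.25.
```

Because every event in the cascade shares one timestamp, the whole
exchange belongs to a single virtual instant.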
Within this consideration, when the X3D runtime creates a frame it
must operate as if it follows some rules. First, the runtime updates
the camera position and orientation, if required, then, if an initial
event has happened, evaluates the current event cascade. The initial
event may come from animations in the scene when the runtime asks if
time or other features need update. If there are no other initial
events for this frame, then produce the frame.
Or, if there are more initial events that should be considered to
produce the frame, evaluate the next cascade. These steps are meant to
give the author and the runtime some versatility but provide an
authoring path with repeatable operations by eliminating leakage of
results of one cascade across multiple frames or between cascades.
This produces a basic scene ready for some advanced operations, so
next the X3D runtime will evaluate any particle systems, then finally,
if physics, then evaluate the physics model, such as mass and force
and friction. Runtime evaluates any events produced by these
interactions. Repeat as necessary to produce the complete computed
result for the next frame.
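The idealized per-frame sequence described above can be sketched with
placeholder steps that simply record their order - camera, then
cascades until quiet, then particles, then physics and the events it
raised (the runtime object here is illustrative):

```javascript
// Record the idealized order of per-frame operations.
function renderFrame(runtime) {
  const order = [];
  order.push("camera");                       // update position/orientation
  while (runtime.pendingInitialEvents > 0) {  // one cascade per initial event
    runtime.pendingInitialEvents--;
    order.push("cascade");
  }
  order.push("particles");                    // evaluate particle systems
  if (runtime.hasPhysics) {
    order.push("physics");                    // mass, force, friction
    order.push("cascade");                    // events the physics produced
  }
  order.push("render");                       // produce the frame
  return order;
}

renderFrame({ pendingInitialEvents: 2, hasPhysics: true });
// ["camera", "cascade", "cascade", "particles", "physics", "cascade", "render"]
```

A real runtime may reorder or parallelize internally, so long as the
result is as if it followed this sequence.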
Again, this does not mean that the runtime needs to do things in any
particular order, just that the result must follow this idealized
model. Experience has shown that when we have an orderly event system
then stuff just happens in a predictable and transportable way.
Plenty of optimizations may be possible, but this is a basic and
well-proven best-practice model, and it can be updated to include
appropriate optimization practices where that helps transportability,
authorability, and accessibility.
Since the shape parts of 3D are all made up of pretty much the same
ordering of data, when becoming interactive with WebGL the first real
challenge is creating an event system. If the syntax is integrated
with the DOM, then the DOM interfaces to elements sort of come for
free. This is the main challenge when consolidating X3D syntax into
the DOM: how to rationalize the event systems? WebGL and X3DOM are
beginning to show how this will work.
These days, with basic 3D support available in most platforms, the
scenegraph, the eventgraph, the user code data wrappers, the
eventtypes, nodetypes, datatypes, authortime and runtime structures,
and event processing details that the author gets to deal with are the
'nostic' features, generally not current shortcomings of host
platforms. Given the great focus on the GL basics, the layer of WebGL,
and the wider interest in applying 3D to everyday transactions, if we
can define common best practice syntax and structures then we can have
realtime interactive 3D everywhere.
A subset could mean the 3D is a static rendering, or animated along a
time line, maybe with interactive aspects; or the htmlized x3d could
include special semantics for specialized areas such as CAD step
layers, H-Anim skin, joints, and segments, medical rendering styles,
or even rigid body physics. X3D has documented best practice
in important areas and should be considered basic study for its
comprehensive coverage of the field.
We are liable to see interesting things happen when we really truly
consolidate X3D syntax as native code producing objects in the
text/html DOM. We all need to study more and understand the html
hierarchy of element types and associated events as related to the X3D
set of definitions.
Fortunately, with all the XML front-end and back-end tools and stuff
available, along with production of a strong accessible live XML DOM,
that is not out of reach.
In summary, the path to realtime interactive 3D started with how to
create and name the data sets, and how to arrange them into structures
suitable for authoring, transport, rendering, event propagation,
animation, networking, efficient update, and extensibility. As with
HTML, the X3D path continues to evolve as both follow a path to one
web with all the Ds we need, anywhere we want to use them.
Thanks to All and Best Regards,