[x3d-public] X3D Audio/Sound review minutes: alignment with W3C Web Audio

Don Brutzman brutzman at nps.edu
Wed May 13 10:51:00 PDT 2020


Today we had an excellent meeting to review the W3C Web Audio API and the X3D Sound component.

Attendees: Efi Lakka, Thanos Malamos, Dick Puk, Don Brutzman.

There are multiple excellent posts of prior meeting minutes on x3d-public, with lots of detail, over the past two months.  Thanks for the continuing feedback, it is very helpful.

Primary focus areas:
- get node interface hierarchy well defined and correctly matching between Web Audio and X3D,
- consider how to define connections between X3D nodes to build an audio graph.

We discussed many aspects.  Here are some of the questions.

1. The attached diagram is excellent.  Is there a similar "official" diagram for W3C Audio Recommendation?

No other API diagram is known or found.  Efi created this helpful illustration [1], reduced-size version attached.  Apologies for resolution, had trouble getting a small output.

On 5/13/2020 7:21 AM, Eftychia Lakka wrote:
> Dear all,
> 
> here <https://www.dropbox.com/sh/egyegfpxdqnbmj3/AABzilvkqDs3674b-qMDzDGOa?dl=0> you can find some content for the today Audio/Sound review meeting.
> 
> Best Regards,
> Efi Lakka

------

2. What do the colors indicate in your diagram?  Legend is needed please.

Purple indicates abstract types, red indicates interfaces (no apparent use?), etc.

------

3. We should offer this diagram to the W3C Audio Working Group as part of joining them.  We should also learn the status of that group, with Efi, Thanos and Don joining.

Good goal for next month.

------

4. We should also decorate your diagram to indicate where matching X3D nodes and node types are getting defined.

We should likely strive to match this hierarchy as closely as possible.  This is a key challenge that will show whether we have reached clarity in our correspondences.

Pros for matching: clearly defined 1::1 functionality for each node/field, with functionality of each specified by W3C CR.

Cons for matching: we should be careful about defining the X3D node hierarchy in API terms.  Must be further careful about matching data types, enumerations, strict mapping to int/float/array value ranges, etc.
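
As a rough illustration of the data-type and enumeration concern, here is a minimal sketch of how X3D attribute strings might be checked against Web Audio value types.  The enumeration values listed are those defined by the Web Audio API; the helper names parseSFFloat and parseEnum are hypothetical and not part of any specification.

```javascript
// Enumeration values as defined by the W3C Web Audio API (case-sensitive).
const WEB_AUDIO_ENUMS = {
  distanceModel: ['linear', 'inverse', 'exponential'],
  panningModel: ['equalpower', 'HRTF'],
  biquadType: ['lowpass', 'highpass', 'bandpass', 'lowshelf',
               'highshelf', 'peaking', 'notch', 'allpass'],
};

// Hypothetical helper: convert an X3D attribute string to a finite float.
function parseSFFloat(fieldName, text) {
  const value = Number(text);
  // Number('') is 0 in JavaScript, so reject blank strings explicitly.
  if (text.trim() === '' || !Number.isFinite(value)) {
    throw new RangeError(`${fieldName}: '${text}' is not a finite number`);
  }
  return value;
}

// Hypothetical helper: check a string value against a Web Audio enumeration.
function parseEnum(enumName, text) {
  const allowed = WEB_AUDIO_ENUMS[enumName];
  if (!allowed.includes(text)) {
    throw new RangeError(`${enumName}: '${text}' is not one of ${allowed.join(', ')}`);
  }
  return text;
}
```

For example, parseSFFloat('frequency', '600') yields the number 600, while parseEnum('panningModel', 'hrtf') throws because the Web Audio value is spelled 'HRTF'.  Whether X3D should enforce such strict matching is exactly the open question above.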

------

4. Web Audio's "AudioNode" seems to be a complete misnomer since AudioNode and AudioContext are *abstract types* that cannot be instantiated /per se/.  The name is too general, and a better name seems to be needed.  This is another formal comment to W3C.  Since it is still a Candidate Recommendation (CR) there is time for them to consider renaming.

[2] Web Audio API, W3C Candidate Recommendation, 18 September 2018
     https://www.w3.org/TR/webaudio/

For starters, X3D name mappings for these abstract types would be
- W3C AudioNode    :: X3DAudioNode and
- W3C AudioContext :: X3DAudioContextNode

We need to address abstract node hierarchy design separately from grouping/connections of actual concrete nodes instantiating an audio graph.

Two candidate approaches for grouping/connections among audio nodes:
- ROUTE (awkward and error prone for design, OK for animation modifications) or
- parent-child field relationships for design (simpler, direct).

------

5. How do "W3C audio authors" currently create these audio graphs for HTML - solely in JavaScript or also in XML?

To our knowledge, only Web Audio JavaScript API is being used.

------

6. Do we want to consider support for WebAudio JavaScript as an alternative form of expressing an audio graph?

There are many (perhaps 30+ ?) nodes and interfaces in Web Audio.  Is it really necessary to create matching nodes in X3D??  Let's be sure.

Adopting W3C Audio API functionality "as is" with no functional changes is certainly appealing and worth considering, since it greatly simplifies
- implementation for X3D players,
- creation, sharing, analysis and extension of audio graphs for X3D authors,
- future compatibility if future Web Audio APIs change.

Cons:
- representation concerns for implementations?  Can we map between the two satisfactorily?

------

7. If we emphasized JavaScript usage, what would a minimal X3D node set look like?  Whether we implement everything or not, the ability to load such a script is useful for X3D authors.

However we proceed, there will be a connection point between an object producing sound and the X3D scene graph.  Perhaps simplest is loading an audio graph in JavaScript?

First impression:  the output of a Web Audio "audio routing graph" produces sound, which seems to map to the X3DSoundSourceNode abstract interface.

[3] X3D Tooltips: AudioClip
     https://www.web3d.org/x3d/content/X3dTooltips.html#AudioClip

[4] X3D Tooltips: Sound
     https://www.web3d.org/x3d/content/X3dTooltips.html#Sound

[5] X3D Architecture, 16.3.2 X3DSoundSourceNode
     https://www.web3d.org/documents/specifications/19775-1/V3.3/Part01/versionContent.html#X3DSoundSourceNode

BTW possible Mantis addition, wondering if we should add an 'enabled' field to X3DSoundSourceNode for easier on/off animation?  (please think of this proposed feature next time you are trying to silence a rogue tab playing audio in your web browser...)

Example of current X3D content:
===============================
SoundAudioClip.x3d
http://X3dGraphics.com/examples/X3dForWebAuthors/Chapter12EnvironmentSensorSound/SoundAudioClip.x3d

     <Sound DEF='Audible' location='0 1.6 0' maxBack='20' maxFront='100' minBack='10' minFront='10' priority='1'>
       <AudioClip DEF='WaterSounds' description='Running Water' loop='true'
         url='"aqua.wav"
              "http://X3dGraphics.com/examples/X3dForWebAuthors/Chapter12EnvironmentSensorSound/aqua.wav"'/>
     </Sound>
===============================

A possible alternative (instead of the current AudioClip being the only kind of sound source) is to modify AudioClip to accept different sources.

Current Efi example, excerpted:

[6] Web3D Spatial Sound Effects and Filters
     http://www.medialab.hmu.gr/minipages/x3domAudio/filters.xhtml

     <AudioSound>
         <Transform USE='Audio1'/>
         <PannerNode coneInnerAngle='360' coneOuterAngle='360' coneOuterGain='0' distanceModel='inverse'
              maxDistance='10000' panningModel='HRTF' refDistance='1' rolloffFactor='1'></PannerNode>
         <AudioSource loop='true' url='"sound/techno-beat.mp3"'/>
         <BiquadFilterNode frequency='600' detune='50.0' Q='30.0' gain='1.0' type='allpass' />
     </AudioSound>

It was not clear why scene-graph structure (e.g. Transform USE Audio1) appears in the middle of the audio graph.  Instead the audio graph should be integral (i.e. audio only) and sit by itself within the scene graph.  Thanos and Efi report that <Transform USE='Audio1'/> simply indicates the position (and implicitly orientation) of the coordinate reference frame for the sound source.

And so, more "X3D-like" as follows:

   <Transform DEF='Audio1'>
     <!-- this is local coordinate reference frame, right here -->
     <AudioSound>
         <!-- note that no X3D scene graph information is needed within AudioSound -->
         <PannerNode coneInnerAngle='360' coneOuterAngle='360' coneOuterGain='0' distanceModel='inverse'
              maxDistance='10000' panningModel='HRTF' refDistance='1' rolloffFactor='1'></PannerNode>
         <AudioSource loop='true' url='"sound/techno-beat.mp3"'/>
         <BiquadFilterNode frequency='600' detune='50.0' Q='30.0' gain='1.0' type='allpass' />
     </AudioSound>
   </Transform>

OK, today's discussion led to another improvement in our shared understanding.  8)

Wondering what that looks like in Web Audio JavaScript API source.  That alternative form is supposed to be identical (if we get the design right).  Thus looking at the corresponding JavaScript might help clarity of understanding, and confirm that the mapping is bidirectional.

TODO: can someone write this script?
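
As a starting point for that TODO, here is a minimal sketch of what the corresponding Web Audio JavaScript might look like.  It is not taken from Efi's implementation.  The function name buildAudio1Graph is hypothetical, and the connection order (source, then filter, then panner, then destination) is an assumption, since the X3D excerpt does not state how its child nodes are wired together.  The caller is assumed to supply an AudioContext and an HTMLMediaElement already pointing at the sound file.

```javascript
// Sketch only: builds source -> BiquadFilter -> Panner -> destination.
// ASSUMPTION: this signal order is a guess; the X3D example does not specify it.
// `audioContext` is an AudioContext, `mediaElement` an HTMLMediaElement.
function buildAudio1Graph(audioContext, mediaElement) {
  // <AudioSource loop='true' url='"sound/techno-beat.mp3"'/>
  mediaElement.loop = true;
  const source = audioContext.createMediaElementSource(mediaElement);

  // <BiquadFilterNode frequency='600' detune='50.0' Q='30.0' gain='1.0' type='allpass'/>
  const filter = audioContext.createBiquadFilter();
  filter.type = 'allpass';
  filter.frequency.value = 600;
  filter.detune.value = 50;
  filter.Q.value = 30;
  filter.gain.value = 1;

  // <PannerNode coneInnerAngle='360' ... rolloffFactor='1'/>
  const panner = audioContext.createPanner();
  panner.panningModel = 'HRTF';
  panner.distanceModel = 'inverse';
  panner.refDistance = 1;
  panner.maxDistance = 10000;
  panner.rolloffFactor = 1;
  panner.coneInnerAngle = 360;
  panner.coneOuterAngle = 360;
  panner.coneOuterGain = 0;

  // Wire the audio graph.
  source.connect(filter);
  filter.connect(panner);
  panner.connect(audioContext.destination);
  return { source, filter, panner };
}
```

In a browser this might be called as buildAudio1Graph(new AudioContext(), new Audio('sound/techno-beat.mp3')).  Note that the sketch does not set the panner position or orientation; in the X3D form that information would come from the enclosing Transform.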

Next.  Once we have such matching JavaScript of Web Audio API source, another opportunity appears... can we use it?  Hypothetical simplest-form alternative example:

   <!-- positioned and oriented in local coordinate reference frame of scene graph at origin of sound -->
   <Sound>
     <AudioSound url='"MyWebAudioScript.js"' description='hypothetical alternate usage for an unmodified script'/>
   </Sound>

One very important example where we will always want full expression in X3D (not just embedded .js source) is dramatically demonstrated in Efi's example: events are being sent into the audio graph to change the parameters at run time.
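
To make the run-time case concrete, here is a minimal sketch of how an incoming X3D-style event (a field name plus a new value) might be applied to a Web Audio BiquadFilterNode while the graph is playing.  The function name applyFilterEvent and the event shape are hypothetical; setValueAtTime() and currentTime are standard Web Audio API members.

```javascript
// Hypothetical handler: apply an X3D-style event to a BiquadFilterNode.
// `audioContext` is an AudioContext and `filter` a BiquadFilterNode.
function applyFilterEvent(audioContext, filter, fieldName, newValue) {
  // These four fields are AudioParam objects in the Web Audio API.
  const audioParams = ['frequency', 'detune', 'Q', 'gain'];
  if (audioParams.includes(fieldName)) {
    // Schedule the change at the current time of the audio clock.
    filter[fieldName].setValueAtTime(newValue, audioContext.currentTime);
  } else if (fieldName === 'type') {
    // `type` is a plain enumerated attribute, not an AudioParam.
    filter.type = newValue;
  } else {
    throw new Error(`unsupported field: ${fieldName}`);
  }
}
```

An X3D player might call such a handler whenever a ROUTE delivers a new value, for example applyFilterEvent(audioContext, filter, 'frequency', 440).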

We now have a much better understanding of how to align these many pieces/parts for best effect - and best sound effects!

Good refinement to our design goals:
- keep refining the object interface hierarchy,
- keep refining the modified "X3D-like" example above,
- parent-child field relationships keep the audio graph terse,
- ordering of X3D child nodes can indicate much of the audio graph structure,
- showing matching examples in both X3D and JavaScript helps understanding,
- avoid using ROUTE except when animation external to the audio graph is desired.
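
One possible reading of the child-ordering goal in the list above, sketched in JavaScript: if children are listed in signal-flow order, the connections of a simple linear audio graph can be derived without any ROUTE statements.  The function name chainFromChildOrder and the ordering convention are assumptions for illustration, not an agreed design, and branching graphs would need more than this.

```javascript
// ASSUMED convention: children are listed in signal-flow order, the first
// child is the source and the last child feeds the audio destination.
function chainFromChildOrder(childNames, destination = 'destination') {
  const chain = [...childNames, destination];
  const edges = [];
  // Connect each node to the one that follows it.
  for (let i = 0; i + 1 < chain.length; i++) {
    edges.push([chain[i], chain[i + 1]]);
  }
  return edges;
}
```

For the child list ['AudioSource', 'BiquadFilterNode', 'PannerNode'] this yields three connections: AudioSource to BiquadFilterNode, BiquadFilterNode to PannerNode, and PannerNode to destination.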

Corrections, questions and improvements remain welcome.  Problems continue to gradually yield, onward we go.

We agreed to meet weekly, Wednesdays 0900-1000 Pacific.  Will work with Anita to choreograph meeting schedules on the Web3D calendar and teleconference pages.

Have fun with W3C Audio and X3D Sound!  8)

all the best, Don
-- 
Don Brutzman  Naval Postgraduate School, Code USW/Br       brutzman at nps.edu
Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA   +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman
-------------- next part --------------
A non-text attachment was scrubbed...
Name: WebAudioInterfaceHierarchyTree.1200x996.png
Type: image/png
Size: 399292 bytes
Desc: not available
URL: <http://web3d.org/pipermail/x3d-public_web3d.org/attachments/20200513/0d4b4738/attachment-0001.png>