[x3d-public] My Experience Implementing V4C Sound in FreeWRL

GPU Group gpugroup at gmail.com
Sat Mar 25 14:39:03 PDT 2023


My Experience Implementing Sound V4Corrigenda in FreeWRL

V4 - the draft web3d v4 spec as submitted to ISO - what I would call a
garbage draft

V4 Corrigenda Draft 7 - corrected in part by Zoom meeting consensus, and in
part by Doug unilaterally

In 2020 I found Labsound - a native library, a fork of the audio part of
WebKit, that would allow implementing web audio from native code.

- I downloaded and built it, but couldn't link it directly into freewrl -
link conflicts - so I isolated it in a DLL and passed parameters across.

- I was able to get a sound to play from it - unrelated to the scene, but
strong evidence I could use it from freewrl.

I did nothing else with sound - delaying it while cleaning up other
functions - until December 2022, and as a result of my choice to delay, I
missed the meetings where decisions were made on the spec design. Now in
2023 I'm 'late to the party'.

In the last 3 months I've dedicated my free time to implementing the sound
component.

In theory it should be a thin wrapper over the web audio API. In practice,
converting the procedural code of the web API to a declarative format
requires some design decisions. The biggest decision they made was to use
the scenegraph to represent connections between audio nodes. That seems to
work well.

But the designers for V4 included lots of bloat - unnecessary and
hard-to-implement fields - as well as missing fields. Recent Zoom meeting
consensus has removed much of the bloat, but too late for the v4 Draft,
which has already been submitted to ISO, I'm told; any technical changes
would need to wait either for a Corrigendum to v4 - taking 1.5 years - or
for v4.1 - taking 4 to 5 years.

Assuming there will be a Corrigendum to v4, I've implemented what I think
the Corrigenda should be - not v4. I don't recommend implementing the v4
Draft as written today.



The tricky part about connections between audio nodes is the channel nodes:
splitter, merger, selector.

Any audio node can be a child of ChannelMerger (except AudioDestination),
and so all nodes need to be ready to connect to a destination channel other
than the default 0.

When I implemented, I left the channel nodes till last, and found it wasn't
too hard to modify my connection code to accommodate them, though it did
need work. I found a child needs to check its connections to its parent on
each render, in case there's a new parent (just getting started, a Switch
node being switched, or DEF/USE multiple parents), or in case the
destination channel on the parent has changed - indicating a disconnect
from the last destination and a connect to the new destination are needed -
which can happen when routing to ChannelSelector.channelDestination.



Most of the processing nodes were close to the web audio API, and
consulting the MDN documentation for the web audio equivalent would shed
light on what the node was supposed to be doing, and give example code
snippets. Google: web audio api <node name>

https://developer.mozilla.org/en-US/docs/Web/API/AudioBuffer



As I write there's still debate over Convolver, BufferAudioSource, and
AudioClip. I implemented Convolver to delegate to a (new) AudioBuffer node
(like the web audio API AudioBuffer), and BufferAudioSource (like the web
audio API AudioBufferSourceNode) to also delegate to AudioBuffer, for
loading from a URL or taking MFFloat raw PCM32 data directly.

There's overlap between BufferAudioSource and AudioClip - should one or the
other be retired, or at least should Sound and SpatialSound take either?
The debate is not yet settled.



There are a few nodes I can't implement with Labsound:
StreamAudioDestination and StreamAudioSource - both appear to pull from and
push to HTML-tag MediaStreams, and I don't have HTML in freewrl.



ListenerPointSource - not in the web audio API, a creation of the Greek
originators - as documented seemed to be related to media streams, and is
in general ill defined; no one has yet explained how it would work with
sufficient clarity to implement it consistently across browsers. I didn't
see it implemented in x3dom as an audio source node, but rather as a
simpler ListenerPoint. In my Corrigenda Draft 7 I replaced
ListenerPointSource with a simpler ListenerPoint, which affects Panner
nodes (in the Sound and SpatialSound node implementations) by replacing the
viewpoint at 0,0,0 with the ListenerPoint pose.


Sound and AudioClip are 'historical' nodes, so to stay compatible with
their historic usage, if Sound has no parent audio node, I have it create
its own AudioDestination node. If it does have a parent audio node, it
doesn't create one. That way it works either way: as part of a bigger audio
graph, or in the historical Sound - AudioClip pair.


I still have bugs in freewrl, and some in Labsound that I've raised issues
on; they are registered bugs for Labsound.

Overall, it seems like a wow-big upgrade to the sound component and
something the web3d community can be proud to support.

Recommendation to other implementers: wait for a consensus-approved
Corrigendum Draft, or v4.1, before lifting a finger to implement, as the
current v4 Draft spec is what I would call a garbage draft.



-Doug

x3dom implementation - copies I gathered

https://freewrl.sourceforge.io/tests/16_Sound/x3dom/

- appears to be pilot / prototype code, not yet generalized into production
code.

freewrl implementation

https://sourceforge.net/p/freewrl/git/ci/develop/tree/freex3d/src/lib/scenegraph/Component_Sound.c

- the freewrl side, with render_<node>(), update_connections(),
push_parent()

https://sourceforge.net/p/freewrl/git/ci/develop/tree/freex3d/src/libsound/libsound.cpp

https://sourceforge.net/p/freewrl/git/ci/develop/tree/freex3d/src/libsound/libsound.h

- wrapper on Labsound handling initialization and updating of web audio
nodes based on web3d nodes passed in.

- and connect / disconnect / initialize context

