[x3d-public] X3D minutes 3 MAR 2023: X3D4 Sound Component and W3C Audio API review
Brutzman, Donald (Don) (CIV)
brutzman at nps.edu
Fri Mar 3 10:45:15 PST 2023
From: Brutzman, Donald (Don) (CIV) <brutzman at nps.edu>
Sent: Friday, March 3, 2023 8:47 AM
To: X3D Public Mailing List (x3d-public at web3d.org) <x3d-public at web3d.org>; Doug Sanden <gpugroup at gmail.com>
Cc: Athanasios Malamos <athanasios.malamos at gmail.com>; Efi Lakka <efilakka at gmail.com>; Brutzman, Donald (Don) (CIV) <brutzman at nps.edu>
Subject: X3D agenda 3 MAR 2023: X3D4 Sound Component and W3C Audio API review
The X3D Working Group meets weekly on Fridays 09-1000 Pacific on the Web3D Consortium Zoom channel. Telcon information:
* https://us02web.zoom.us/j/81634670698?pwd=a1VPeU5tN01rc21Oa3hScUlHK0Rxdz09
* https://zoom.us/j/148206572 Password 483805
* https://www.web3d.org/member/teleconference-information
1. X3D 2023 goals. Our primary activities for X3D Working Group in 2023 are focused on broad and correct deployment.
a. Encourage consistent rendering, interaction and usage for the many tremendous capabilities in X3D4.
b. Update ISO specifications and implementations for multiple programming languages and file encodings to match X3D4.
2. Web3D 2023 Conference
Vicomtech has followed up on their initial creation of the spinning conference logo a decade ago… it is back!!!
* Web3D 2023, 9-11 October 2023, Vicomtech
* San Sebastian Spain, In-person and Online
* https://web3d.siggraph.org
It will be really interesting if we can start collecting and disseminating all 3D models for the conference…
3. Focus: review of X3D4 Sound Component and W3C Audio API.
Our current goal is to identify and remedy any errors or gaps in the X3D4 Architecture with respect to W3C Audio API.
Design goals are attempting to align existing patterns and semantics wherever possible. We are not inventing, we are checking and fixing.
* X3D 4.0 Architecture Draft International Specification (DIS) Clause 4, Sound component
* https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html
* Web Audio API, W3C Recommendation
* https://www.w3.org/TR/webaudio
As part of implementation efforts for FreeWRL, Doug Sanden has posted a series of public emails identifying possible mismatches or omissions.
TimeDependentNode
* http://web3d.org/pipermail/x3d-public_web3d.org/2023-January/018420.html
notes apparent mismatch between X3D timing and W3C Audio timing representations for audio-source nodes.
* https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/time.html#X3DTimeDependentNode
X3DTimeDependentNode : X3DChildNode {
SFString [in,out] description ""
SFBool [in,out] enabled FALSE
SFBool [in,out] loop FALSE
SFNode [in,out] metadata NULL [X3DMetadataObject]
SFTime [in,out] pauseTime 0 (-∞,∞)
SFTime [in,out] resumeTime 0 (-∞,∞)
SFTime [in,out] startTime 0 (-∞,∞)
SFTime [in,out] stopTime 0 (-∞,∞)
SFTime [out] elapsedTime
SFBool [out] isActive
SFBool [out] isPaused
}
* https://www.w3.org/TR/webaudio/#AudioScheduledSourceNode
interface AudioScheduledSourceNode : AudioNode {
attribute EventHandler onended;
undefined start(optional double when = 0);
undefined stop(optional double when = 0);
};
Matching for sources makes sense as a direct correspondence.
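One way to picture the timing correspondence is as a clock conversion. X3D SFTime values are absolute (seconds since 1 January 1970 UTC), while the when arguments of start() and stop() use the time base of AudioContext.currentTime, which begins at zero when the context is created. The Python sketch below is only an illustration, not FreeWRL or X3DOM code: the function name and the context_created_at parameter are hypothetical, and it assumes the implementation has recorded the wall-clock time at which its audio context was created.

```python
def x3d_time_to_audio_context_time(x3d_time, context_created_at):
    """Map an absolute X3D SFTime onto the audio-context clock.

    x3d_time: startTime or stopTime, in seconds since the 1970 epoch.
    context_created_at: epoch time at which the audio context was created
        (hypothetical bookkeeping value kept by the implementation).
    """
    when = x3d_time - context_created_at
    # Web Audio plays immediately if 'when' is already in the past and
    # rejects negative values, so clamp at zero.
    return max(0.0, when)


context_created_at = 1677862800.0        # assumed example value
start_time = context_created_at + 2.5    # X3D startTime, 2.5 s after creation
print(x3d_time_to_audio_context_time(start_time, context_created_at))
```

pauseTime and resumeTime have no counterpart in the AudioScheduledSourceNode interface quoted above, so they are left out of this sketch.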
Note that “If enabled field is FALSE, the audio signal passes through unmodified and is not blocked.”
Matching for X3DSoundDestinationNode and X3DSoundProcessingNode was included to provide X3D authors with fine-grained control of nodes including animation options in addition to enabled/disabled. For larger audio graphs impacting multiple outputs from a large diverse scene, such additional flexibility of control seemed to add value. Not considered heavyweight since it is an existing (and commonly needed) interface.
Note that we are not re-designing at this point, rather we are checking for flaws or omissions.
Seems like this is implementable… let’s review carefully please.
Doug believes it is a bad design, but acknowledges that it may be implementable (at this stage of his implementation at least).
Nicholas points out if it is not an exact match with Web Audio API, added functionality might be considered a problem.
Does anyone think that removing X3DTimeDependentNode from X3DSoundProcessingNode blocks any audio-graph authoring in X3D? We found no cases… and Doug has implemented all of them without including it. It also seems to be the most conservative approach.
Thanos and Efi, comment please.
Gain
Gain field: we added gain to many nodes for simplification of audio graph construction. It is equivalent to inserting a gain node in between two other nodes.
This issue is similar to the preceding issue in that it strays from precise alignment with Web Audio API, motivated by facilitating X3D authoring.
Resolution should match preceding issue.
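The claimed equivalence between a gain field and an inserted gain node can be checked with a toy model. This Python sketch treats an audio stream as a plain list of samples; the function names are hypothetical and nothing here is taken from an actual X3D player.

```python
def scale(samples, gain):
    """A stand-alone gain stage: multiply every sample by gain."""
    return [sample * gain for sample in samples]


def source_with_gain_field(samples, gain=1.0):
    """Hypothetical source node whose own gain field scales its output."""
    return scale(samples, gain)


signal = [0.5, -0.25, 1.0]

# Option A: the node's gain field does the scaling.
with_field = source_with_gain_field(signal, gain=0.5)

# Option B: unity gain on the node, followed by a separate gain node.
with_extra_node = scale(source_with_gain_field(signal, gain=1.0), 0.5)

assert with_field == with_extra_node
```

In this model the field is pure authoring convenience: it saves one node and one connection without changing the signal.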
Analyser Outputs
* http://web3d.org/pipermail/x3d-public_web3d.org/2023-February/018423.html
* https://medialab.hmu.gr/minipages/x3domAudio/SpatialSoundFilter.xhtml
Audio graph drawing produced by Efi Lakka.
* http://web3d.org/pipermail/x3d-public_web3d.org/2023-February/018423.html
* https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html#Analyser
* https://www.w3.org/TR/webaudio/#AnalyserNode
“It is a mystery to me how we would use it to visualize without access to the data.”
First, a node implementation can provide analytic output, i.e. browser-dependent outputs.
Second, the Web Audio API does include outputs, provided as methods that copy the analysis data into caller-supplied arrays.
undefined getFloatFrequencyData (Float32Array array);
undefined getByteFrequencyData (Uint8Array array);
undefined getFloatTimeDomainData (Float32Array array);
undefined getByteTimeDomainData (Uint8Array array);
So point well taken that we should provide corresponding outputs. How should we define corresponding fields in X3D? Doug’s subsequent suggestion:
* http://web3d.org/pipermail/x3d-public_web3d.org/2023-February/018430.html
frequencyData MFFloat outputOnly
timeDomainData MFFloat outputOnly
with cool example outputs demonstrated
* https://drive.google.com/file/d/1h_ECxw7IwGVvYNblBw9thIEsqMnnZ6a9/view
* https://drive.google.com/file/d/1OWtDr-cvCnwhEDeORUfusiyAaEBL_vuj/view
Resolution? Accept. TODO add prose for X3D specification.
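If X3D exposes MFFloat outputs, an implementation built on Web Audio can choose between the float and byte getters. The Web Audio API defines the byte forms as a rescaling of the float data; the sketch below restates those two formulas in Python, using the default minDecibels and maxDecibels of -100 and -30 dB. The function names are illustrative only and are not proposed field or method names.

```python
import math


def to_byte_frequency(decibels, min_decibels=-100.0, max_decibels=-30.0):
    """Rescale one frequency bin from dB to the 0..255 byte range."""
    scaled = 255.0 * (decibels - min_decibels) / (max_decibels - min_decibels)
    # Clamp first so that silence reported as -infinity maps to 0.
    scaled = max(0.0, min(255.0, scaled))
    return math.floor(scaled)


def to_byte_time_domain(sample):
    """Rescale one waveform sample from [-1, 1] to the 0..255 byte range."""
    scaled = 128.0 * (1.0 + sample)
    scaled = max(0.0, min(255.0, scaled))
    return math.floor(scaled)


frequency_data = [-100.0, -65.0, -30.0]     # example dB values per bin
time_domain_data = [-1.0, 0.0, 0.5, 1.0]    # example waveform samples

print([to_byte_frequency(db) for db in frequency_data])
print([to_byte_time_domain(x) for x in time_domain_data])
```

The float forms carry strictly more information, which is one argument for MFFloat as the X3D output type.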
Sound v4 suggestions
* http://web3d.org/pipermail/x3d-public_web3d.org/2023-February/018456.html
DynamicsCompressor: make field type for attack and release the same (both in seconds)
* https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html#DynamicsCompressor
DynamicsCompressor : X3DSoundProcessingNode {
SFFloat [in,out] attack 0.003 [0,∞)
SFTime [in,out] release 0.25 [0,∞)
* https://www.w3.org/TR/webaudio/#DynamicsCompressorNode
readonly attribute AudioParam attack;
readonly attribute AudioParam release;
Wikipedia: Perceptual attack time
* https://en.wikipedia.org/wiki/Perceptual_attack_time
* “Perceptual Attack Time (often abbreviated "PAT") is a subjective measure of the time instant at which a musical sound's rhythmic emphasis is heard.”
Agreed with that change, makes sense that both X3D fields should have type SFTime.
____
ChannelMerger
* https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html#ChannelMerger
* The ChannelMergerNode Interface
* https://www.w3.org/TR/webaudio/#ChannelMergerNode
Channel count often not found in Web Audio API, seems to be implicit. Adding lots of variations to match W3C Audio API details would turn X3D interfaces into a programming interface that required Script support. Thus ChannelMerger and ChannelSplitter simply merge all or split all. ChannelSelector provided to pick just one. Hopefully this provides authoring paths to handle any case.
Suggest we continue if that is workable. Further refinements or extensions are likely best deferred for a possible future X3D 4.1 version once better X3D practice and understanding is established.
Reluctant to change a basic solution to something more involved without further discussion and evidence.
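The merge-all, split-all and pick-one behavior described above can be modeled in a few lines. This Python sketch is a deliberate simplification: a stream is a list of channels, a channel is a list of samples, every merger input is assumed to be mono already, and the function and parameter names are hypothetical rather than taken from either specification.

```python
def merge_all(mono_inputs):
    """ChannelMerger-like: one output stream with one channel per input."""
    return [list(channel) for channel in mono_inputs]


def split_all(stream):
    """ChannelSplitter-like: one single-channel output stream per channel."""
    return [[list(channel)] for channel in stream]


def select_one(stream, index):
    """ChannelSelector-like: a single-channel stream holding channel 'index'."""
    return [list(stream[index])]


left = [0.1, 0.2, 0.3]
right = [-0.1, -0.2, -0.3]

stereo = merge_all([left, right])       # two channels in one stream
outputs = split_all(stereo)             # two one-channel streams
right_only = select_one(stereo, 1)      # just the second channel

print(len(stereo), len(outputs), right_only)
```

Any single channel can be reached with these three operations and no Script node, which is the authoring path the minutes describe.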
Convolution: allow url loading of buffer field
* https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html#Convolver
Convolver : X3DSoundProcessingNode {
MFFloat [in,out] buffer [] [−1,1]
The buffer field represents a memory-resident audio asset (for one-shot sounds and other short audio clips). Its format is non-interleaved 32-bit linear floating-point PCM values with a normal range of [−1,1], but values are not limited to this range. It can contain one or more channels. Typically, it would be expected that the length of the PCM data would be fairly short (usually somewhat less than a minute). For longer sounds, such as music soundtracks, streaming should be used with the <audio> HTML element and AudioClip.
The buffer field is intended to match the floats found in the AudioBuffer data structure in Web Audio API.
* https://www.w3.org/TR/webaudio/#convolvernode
interface ConvolverNode : AudioNode {
constructor (BaseAudioContext context, optional ConvolverOptions options = {});
attribute AudioBuffer? buffer;
attribute boolean normalize;
};
Seems reasonable to permit a url alternative for buffer – but what formats should be allowed? Should be well defined.
Mozilla Developer Network example shows loading of a WAV file.
* https://developer.mozilla.org/en-US/docs/Web/API/ConvolverNode
One possible resolution: add url field and prose.
* MFString [in,out] url [] [WAV]
Alternatively: allow Convolver to load an AudioClip node, limited to WAV format and perhaps restricting/ignoring irrelevant fields.
* SFNode [in,out] children [] [AudioClip]
Deserves further consideration.
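To make the url option concrete, here is a rough Python sketch of what loading a WAV file into the buffer field could involve: decoding interleaved integer samples into non-interleaved floats in [-1,1]. It assumes 16-bit PCM input only, uses the standard-library wave module, and both function names are hypothetical; it does not address the other WAV encodings a specification would need to enumerate.

```python
import io
import struct
import wave


def wav_to_channel_buffers(wav_file):
    """Read 16-bit PCM WAV data; return one list of floats per channel."""
    with wave.open(wav_file, "rb") as reader:
        if reader.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channel_count = reader.getnchannels()
        raw = reader.readframes(reader.getnframes())
    # WAV stores samples interleaved: L0 R0 L1 R1 ...
    interleaved = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # De-interleave and scale each sample into [-1, 1).
    return [
        [sample / 32768.0 for sample in interleaved[channel::channel_count]]
        for channel in range(channel_count)
    ]


def make_test_wav(samples, channel_count, sample_rate=44100):
    """Build a tiny in-memory WAV file (helper for this demo only)."""
    stream = io.BytesIO()
    with wave.open(stream, "wb") as writer:
        writer.setnchannels(channel_count)
        writer.setsampwidth(2)
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack("<%dh" % len(samples), *samples))
    stream.seek(0)
    return stream


# Two stereo frames: (16384, -32768) and (0, 16384).
demo = make_test_wav([16384, -32768, 0, 16384], channel_count=2)
print(wav_to_channel_buffers(demo))
```

A single MFFloat cannot by itself say where one channel ends and the next begins, so either option would also need prose on channel count and layout.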
ListenerPointSource
* http://web3d.org/pipermail/x3d-public_web3d.org/2023-March/018570.html
* https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html#ListenerPointSource
Spec: “If the dopplerEnabled field is TRUE, ListenerPointSource children sources which are moving spatially in the transformation hierarchy, relative to the location of the ListenerPointSource node, shall apply velocity-induced frequency shifts corresponding to Doppler effect.”
ListenerPointSource can be considered as a “virtual microphone” listening (directionally) somewhere else in the world.
The word “children” and the phrase “relative to the location of the ListenerPointSource node,” should be omitted since there are no children. TODO improve that sentence.
What else should we say in the node definition? Hopefully that will be sufficient to remove any remaining ambiguity.
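For readers who want a number behind "velocity-induced frequency shifts", the classical Doppler relation is sketched below in Python. The quoted specification text does not give a formula, so this is the textbook form and not X3D normative text; the function name, the sign convention (positive speeds mean the two are closing) and the 343 m/s speed of sound are assumptions.

```python
SPEED_OF_SOUND = 343.0  # m/s, roughly air at room temperature (assumed)


def doppler_shifted_frequency(frequency,
                              source_speed_toward_listener,
                              listener_speed_toward_source=0.0,
                              speed_of_sound=SPEED_OF_SOUND):
    """Classical Doppler shift along the line joining source and listener."""
    if source_speed_toward_listener >= speed_of_sound:
        raise ValueError("source speed must stay below the speed of sound")
    return (frequency
            * (speed_of_sound + listener_speed_toward_source)
            / (speed_of_sound - source_speed_toward_listener))


# A 440 Hz source approaching, at rest, and receding at 34.3 m/s.
for speed in (34.3, 0.0, -34.3):
    print(speed, doppler_shifted_frequency(440.0, speed))
```

Only the relative motion along the source-to-listener line matters here, which fits treating ListenerPointSource as a virtual microphone with its own position rather than as a parent of the sources.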
* http://web3d.org/pipermail/x3d-public_web3d.org/2023-March/018572.html
ListenerPointSource creates (is a source of) audio streams, based on its location and orientation in the scene. No input field is needed.
* http://web3d.org/pipermail/x3d-public_web3d.org/2023-March/018573.html
“Q6. while paused or stopped, should SAD stream be buffered/queued or lost? If buffered, should there be a buffer size limit?”
Defer to Web Audio API, we are not adding further constraints or limitations.
We will resume this technical comparison next week. Dick will also review how any changes deemed essential are best communicated to ISO.
Thanks for impressive efforts. Ringing loud and clear: “Have Fun With X3D!” 8)
all the best, Don
--
Don Brutzman Naval Postgraduate School, Code USW/Br brutzman at nps.edu <mailto:brutzman at nps.edu>
Watkins 270, MOVES Institute, Monterey CA 93943-5000 USA +1.831.656.2149
X3D graphics, virtual worlds, navy robotics, data http://faculty.nps.edu/brutzman