[x3d-public] X3D agenda 3 MAR 2023: X3D4 Sound Component and W3C Audio API review

Fri Mar 3 08:47:21 PST 2023

The X3D Working Group meets weekly on Fridays 0900-1000 Pacific on the Web3D Consortium Zoom channel.  Telcon information:

*	https://us02web.zoom.us/j/81634670698?pwd=a1VPeU5tN01rc21Oa3hScUlHK0Rxdz09 
*	https://zoom.us/j/148206572  Password 483805 
*	https://www.web3d.org/member/teleconference-information 

1.	X3D 2023 goals.  Our primary activities for X3D Working Group in 2023 are focused on broad and correct deployment.

a.	Encourage consistent rendering, interaction and usage for the many tremendous capabilities in X3D4.
b.	Update ISO specifications and implementations for multiple programming languages and file encodings to match X3D4.

2.	Focus: review of X3D4 Sound Component and W3C Audio API.

Our current goal is to identify and remedy any errors or gaps in the X3D4 Architecture with respect to the W3C Audio API.

*	X3D 4.0 Architecture Draft International Specification (DIS) Clause 4, Sound component
*	https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html

*	Web Audio API, W3C Recommendation
*	https://www.w3.org/TR/webaudio

As part of implementation efforts for FreeWRL, Doug Sanden has posted a series of public emails identifying possible mismatches or omissions.

TimeDependentNode

*	http://web3d.org/pipermail/x3d-public_web3d.org/2023-January/018420.html

Notes an apparent mismatch between the X3D timing and W3C Audio timing representations for audio-source nodes.

*	https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/time.html#X3DTimeDependentNode

X3DTimeDependentNode : X3DChildNode {
  SFString [in,out] description  ""
  SFBool   [in,out] enabled      FALSE
  SFBool   [in,out] loop         FALSE
  SFNode   [in,out] metadata     NULL  [X3DMetadataObject]
  SFTime   [in,out] pauseTime    0     (-∞,∞)
  SFTime   [in,out] resumeTime   0     (-∞,∞)
  SFTime   [in,out] startTime    0     (-∞,∞)
  SFTime   [in,out] stopTime     0     (-∞,∞)
  SFTime   [out]    elapsedTime
  SFBool   [out]    isActive
  SFBool   [out]    isPaused
}

*	https://www.w3.org/TR/webaudio/#AudioScheduledSourceNode

interface AudioScheduledSourceNode : AudioNode {
  attribute EventHandler onended;
  undefined start (optional double when = 0);
  undefined stop (optional double when = 0);
};

Design goals are attempting to align existing patterns and semantics wherever possible.

Matching for sources makes sense as a direct correspondence.

Matching for X3DSoundDestinationNode and X3DSoundProcessingNode was included to provide X3D authors with fine-grained control of nodes including animation options in addition to enabled/disabled.  For larger audio graphs impacting multiple outputs from a large diverse scene, such additional flexibility of control seemed to add value.  Not considered heavyweight since it is an existing (and commonly needed) interface.

Note that we are not re-designing at this point, rather we are checking for flaws or omissions.

Seems like this is implementable… let’s review carefully please.
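
To make the timing question concrete, here is a minimal sketch of one way an implementation might bridge the two representations.  X3D SFTime values are absolute (seconds since 1 January 1970), while start(when) and stop(when) take seconds on the audio context clock.  The names ClockSnapshot, toContextTime, ScheduledSourceLike and scheduleFromX3D are hypothetical helpers for illustration only, not part of either specification.

```typescript
/** Snapshot of both clocks, taken at (nearly) the same instant. */
interface ClockSnapshot {
  x3dNow: number;     // current X3D time, seconds since 1 January 1970
  contextNow: number; // audio context currentTime at that same instant
}

/** Convert an absolute X3D time to the audio context time base. */
function toContextTime(x3dTime: number, clocks: ClockSnapshot): number {
  const offset = x3dTime - clocks.x3dNow; // negative if already in the past
  // start()/stop() treat a time earlier than currentTime as "immediately",
  // but a negative argument is an error, so clamp at zero.
  return Math.max(0, clocks.contextNow + offset);
}

/** Minimal structural stand-in for AudioScheduledSourceNode. */
interface ScheduledSourceLike {
  start(when?: number): void;
  stop(when?: number): void;
}

/** Schedule a source from X3D startTime and stopTime field values. */
function scheduleFromX3D(
  source: ScheduledSourceLike,
  startTime: number,
  stopTime: number,
  clocks: ClockSnapshot
): void {
  source.start(toContextTime(startTime, clocks));
  // X3D ignores a stopTime that is less than or equal to startTime;
  // this sketch assumes that means "do not schedule a stop".
  if (stopTime > startTime) {
    source.stop(toContextTime(stopTime, clocks));
  }
}
```

pauseTime and resumeTime have no direct counterpart in start/stop, so they are deliberately left out of this sketch.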

Analyser Outputs

*	http://web3d.org/pipermail/x3d-public_web3d.org/2023-February/018423.html
*	https://medialab.hmu.gr/minipages/x3domAudio/SpatialSoundFilter.xhtml

Audio graph drawing produced by Efi Lakka.

*	http://web3d.org/pipermail/x3d-public_web3d.org/2023-February/018423.html
*	https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html#Analyser
*	https://www.w3.org/TR/webaudio/#AnalyserNode

“It is a mystery to me how we would use it to visualize without access to the data.”

First, a node implementation can provide analytic output, i.e. browser-dependent outputs.

Second, the Web Audio API AnalyserNode does include methods that return output data.

  undefined getFloatFrequencyData (Float32Array array);

  undefined getByteFrequencyData (Uint8Array array);

  undefined getFloatTimeDomainData (Float32Array array);

  undefined getByteTimeDomainData (Uint8Array array);

So point well taken that we should provide corresponding outputs.  How should we define corresponding fields in X3D?  Doug’s subsequent suggestion:

*        http://web3d.org/pipermail/x3d-public_web3d.org/2023-February/018430.html

frequencyData MFFloat outputOnly
timeDomainData MFFloat outputOnly

with cool example outputs demonstrated

*        https://drive.google.com/file/d/1h_ECxw7IwGVvYNblBw9thIEsqMnnZ6a9/view
*        https://drive.google.com/file/d/1OWtDr-cvCnwhEDeORUfusiyAaEBL_vuj/view
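
As a rough sketch of how the two suggested fields could be filled from the float variants of the Web Audio methods listed above: AnalyserLike is a structural subset of AnalyserNode so the example runs without a browser, and AnalyserOutputs and readAnalyserOutputs are hypothetical names, not proposed spec text.

```typescript
/** Structural subset of the Web Audio AnalyserNode used by this sketch. */
interface AnalyserLike {
  readonly fftSize: number;
  readonly frequencyBinCount: number; // half of fftSize
  getFloatFrequencyData(array: Float32Array): void;
  getFloatTimeDomainData(array: Float32Array): void;
}

/** Hypothetical holder for the two suggested outputOnly MFFloat fields. */
interface AnalyserOutputs {
  frequencyData: number[];  // one value per frequency bin
  timeDomainData: number[]; // waveform samples
}

/** Copy a typed array into a plain array, standing in for an MFFloat value. */
function toMFFloat(data: Float32Array): number[] {
  const out: number[] = [];
  for (let i = 0; i < data.length; i++) out.push(data[i]);
  return out;
}

/** Read both analyser outputs once, e.g. once per rendered frame. */
function readAnalyserOutputs(analyser: AnalyserLike): AnalyserOutputs {
  const freq = new Float32Array(analyser.frequencyBinCount);
  const time = new Float32Array(analyser.fftSize);
  analyser.getFloatFrequencyData(freq);
  analyser.getFloatTimeDomainData(time);
  return { frequencyData: toMFFloat(freq), timeDomainData: toMFFloat(time) };
}
```

Using only the float variants keeps a single MFFloat type for both outputs; whether byte variants are also needed is left open here.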

Resolution?

Sound v4 suggestions

* http://web3d.org/pipermail/x3d-public_web3d.org/2023-February/018456.html

DynamicsCompressor: make field type for attack and release the same (both in seconds)

*  https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html#DynamicsCompressor

DynamicsCompressor : X3DSoundProcessingNode {

  SFFloat   [in,out] attack                0.003      [0,∞)

  SFTime   [in,out] release               0.25       [0,∞)

* https://www.w3.org/TR/webaudio/#DynamicsCompressorNode

  readonly attribute AudioParam attack;

  readonly attribute AudioParam release;

Wikipedia: Perceptual attack time

* https://en.wikipedia.org/wiki/Perceptual_attack_time

* “Perceptual Attack Time (often abbreviated "PAT") is a subjective measure of the time instant at which a musical sound's rhythmic emphasis is heard.”

Agreed with that change, makes sense that both X3D fields should have type SFTime.
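
With both fields in seconds, mapping onto the Web Audio parameters becomes a direct assignment.  A minimal sketch follows, assuming an implementation clamps the X3D values (range [0,∞)) to whatever nominal range each AudioParam reports through minValue and maxValue.  ParamLike, CompressorLike and applyCompressorTimes are hypothetical names for illustration.

```typescript
/** X3D-side values, both SFTime (seconds). */
interface CompressorTimes {
  attack: number;
  release: number;
}

/** Structural stand-in for an AudioParam: only the members used here. */
interface ParamLike {
  value: number;
  readonly minValue: number;
  readonly maxValue: number;
}

/** Structural stand-in for the two DynamicsCompressorNode parameters. */
interface CompressorLike {
  readonly attack: ParamLike;
  readonly release: ParamLike;
}

/** Clamp a value in seconds to the range the parameter reports. */
function clampToParam(seconds: number, param: ParamLike): number {
  return Math.min(param.maxValue, Math.max(param.minValue, seconds));
}

/** Copy X3D attack/release (seconds) onto the audio node parameters. */
function applyCompressorTimes(fields: CompressorTimes, node: CompressorLike): void {
  node.attack.value = clampToParam(fields.attack, node.attack);
  node.release.value = clampToParam(fields.release, node.release);
}
```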

____

ChannelMerger

* https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html#ChannelMerger

*  The ChannelMergerNode Interface

* https://www.w3.org/TR/webaudio/#ChannelMergerNode

Channel count is often not stated explicitly in the Web Audio API and seems to be implicit.  Adding lots of variations to match W3C Audio API details would turn X3D interfaces into a programming interface that required Script support.  Thus ChannelMerger and ChannelSplitter simply merge all or split all.  ChannelSelector is provided to pick just one.  Hopefully this provides authoring paths to handle any case.

Suggest we continue if that is workable.  Further refinements or extensions are likely best deferred for a possible future X3D 4.1 version once better X3D practice and understanding is established.
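
The merge-all, split-all and pick-one behaviors described above can be sketched on plain sample arrays.  This is an illustration only, assuming each stream is a list of per-channel Float32Array buffers; the function names mergeAll, splitAll and selectChannel are hypothetical and are not node or field names from either specification.

```typescript
/** One Float32Array of samples per channel (non-interleaved). */
type Channels = Float32Array[];

/** ChannelMerger-style: gather every channel of every input into one stream. */
function mergeAll(inputs: Channels[]): Channels {
  const merged: Channels = [];
  for (let i = 0; i < inputs.length; i++) {
    for (let c = 0; c < inputs[i].length; c++) merged.push(inputs[i][c]);
  }
  return merged;
}

/** ChannelSplitter-style: one single-channel stream per input channel. */
function splitAll(input: Channels): Channels[] {
  const outputs: Channels[] = [];
  for (let c = 0; c < input.length; c++) outputs.push([input[c]]);
  return outputs;
}

/** ChannelSelector-style: pick exactly one channel by zero-based index. */
function selectChannel(input: Channels, index: number): Channels {
  if (index < 0 || index >= input.length || Math.floor(index) !== index) {
    throw new RangeError("channel index " + index + " is out of range");
  }
  return [input[index]];
}
```

Combining splitAll with selectChannel and mergeAll covers reordering or dropping channels without any Script node, which is the authoring path suggested above.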

Convolution: allow url loading of buffer field

* https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-DIS/Part01/components/sound.html#Convolver

Convolver : X3DSoundProcessingNode {

  MFFloat  [in,out] buffer                []         [−1,1]

  The buffer field represents a memory-resident audio asset (for one-shot sounds and other short audio clips). Its format is non-interleaved 32-bit linear floating-point PCM values with a normal range of [−1,1], but values are not limited to this range. It can contain one or more channels. Typically, it would be expected that the length of the PCM data would be fairly short (usually somewhat less than a minute). For longer sounds, such as music soundtracks, streaming should be used with the <audio> HTML element and AudioClip.

* https://www.w3.org/TR/webaudio/#convolvernode

interface ConvolverNode : AudioNode {

  constructor (BaseAudioContext context, optional ConvolverOptions options = {});

  attribute AudioBuffer? buffer;

  attribute boolean normalize;

};

Seems reasonable to permit a url alternative for buffer – but what formats should be allowed?  Should be well defined.
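
Whatever formats are allowed, a loaded file would presumably need to be decoded into the same non-interleaved float form as the buffer field.  A minimal sketch of unpacking such a flat MFFloat value follows, under the assumption (not stated in the spec text quoted above) that channels are stored back-to-back with equal length and that the channel count is known separately.  splitNonInterleaved and peakMagnitude are hypothetical helpers.

```typescript
/** Split a flat, non-interleaved sample list into one array per channel. */
function splitNonInterleaved(buffer: number[], numberOfChannels: number): Float32Array[] {
  if (numberOfChannels < 1 || Math.floor(numberOfChannels) !== numberOfChannels) {
    throw new RangeError("numberOfChannels must be a positive integer");
  }
  if (buffer.length % numberOfChannels !== 0) {
    throw new RangeError("buffer length is not a multiple of numberOfChannels");
  }
  const frames = buffer.length / numberOfChannels;
  const channels: Float32Array[] = [];
  for (let c = 0; c < numberOfChannels; c++) {
    const channel = new Float32Array(frames);
    // Assumed layout: all samples of channel 0, then all of channel 1, ...
    for (let i = 0; i < frames; i++) channel[i] = buffer[c * frames + i];
    channels.push(channel);
  }
  return channels;
}

/** Largest absolute sample value; above 1 means outside the normal range. */
function peakMagnitude(buffer: number[]): number {
  let peak = 0;
  for (let i = 0; i < buffer.length; i++) peak = Math.max(peak, Math.abs(buffer[i]));
  return peak;
}
```

Values beyond [−1,1] are reported rather than rejected, since the field description says values are not limited to that range.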

ListenerPointSource

*	http://web3d.org/pipermail/x3d-public_web3d.org/2023-March/018570.html

Spec: “If the dopplerEnabled field is TRUE, ListenerPointSource children sources which are moving spatially in the transformation hierarchy, relative to the location of the ListenerPointSource node, shall apply velocity-induced frequency shifts corresponding to Doppler effect.”

The word “children” and the phrase “relative to the location of the ListenerPointSource node,” should be omitted since there are no children.

Hopefully that is sufficient to remove any remaining ambiguity.
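
For reference, a sketch of the frequency shift itself, using the classical Doppler formula for subsonic motion along the line between a moving source and a moving listener.  The spec sentence does not give a formula, so this is an assumption for illustration; dopplerRatio, Vec3 and the constant speed of sound of 343 m/s are hypothetical choices.

```typescript
type Vec3 = [number, number, number];

function dot(a: Vec3, b: Vec3): number {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/** Assumed constant: speed of sound in air at about 20 degrees C, in m/s. */
const SPEED_OF_SOUND = 343;

/**
 * Ratio of heard frequency to emitted frequency.
 * Positions in meters, velocities in m/s, all in the same coordinate frame.
 */
function dopplerRatio(
  sourcePos: Vec3,
  sourceVel: Vec3,
  listenerPos: Vec3,
  listenerVel: Vec3,
  c: number = SPEED_OF_SOUND
): number {
  const d: Vec3 = [
    listenerPos[0] - sourcePos[0],
    listenerPos[1] - sourcePos[1],
    listenerPos[2] - sourcePos[2],
  ];
  const dist = Math.sqrt(dot(d, d));
  if (dist === 0) return 1; // coincident points: no defined direction, no shift
  const u: Vec3 = [d[0] / dist, d[1] / dist, d[2] / dist]; // source -> listener
  const vs = dot(sourceVel, u);    // source speed toward the listener
  const vl = -dot(listenerVel, u); // listener speed toward the source
  if (Math.abs(vs) >= c || Math.abs(vl) >= c) {
    throw new RangeError("formula is only valid for subsonic motion");
  }
  return (c + vl) / (c - vs);
}
```

A ratio above 1 means the pitch rises (approaching), below 1 means it falls (receding).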

*	http://web3d.org/pipermail/x3d-public_web3d.org/2023-March/018572.html

ListenerPointSource creates (is a source of) audio streams, based on its location and orientation in the scene.  No input field is needed.

*	http://web3d.org/pipermail/x3d-public_web3d.org/2023-March/018573.html

“Q6. while paused or stopped, should SAD stream be buffered/queued or lost? If buffered, should there be a buffer size limit?”

Defer to Web Audio API, we are not adding further constraints or limitations.

Anything else Doug?

all the best, Don

-- 

Don Brutzman  Naval Postgraduate School, Code USW/Br       brutzman at nps.edu

Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA  +1.831.656.2149

X3D graphics, virtual worlds, navy robotics, data  http://faculty.nps.edu/brutzman
