[x3d-public] X3D4 Sound meeting 30 SEP 2020: Web3D 2020 preparations, Gain and ChannelSelector nodes, avoiding channel indices via parent-child MFNode field relationships

Don Brutzman brutzman at nps.edu
Sun Oct 4 19:15:25 PDT 2020


[Summary: all proposed spec changes applied, final-review draft copy of Sound component attached.]

First, one unresolved issue relating to nearly all abstract types and nodes.

"TODO: if enabled FALSE, does signal pass through unmodified or is it blocked? Perhaps an additional boolean is needed for pass-through state? Modeling the 'connect' attribute and defining defaults is necessary for each case."

Thoughts?  Is there something in the Web Audio API about this?  I only find 'enabled' in there once, but it seems to indicate that we might need fields for 'live' and 'muted' as well.

================================
1.5.3. AudioNode Lifetime

https://www.w3.org/TR/webaudio/#AudioNode-actively-processing

A MediaStreamAudioSourceNode or a MediaStreamTrackAudioSourceNode are actively processing when the associated MediaStreamTrack object has a readyState attribute equal to "live", a muted attribute equal to false and an enabled attribute equal to true.
================================

On 10/2/2020 6:59 AM, Athanasios Malamos wrote:
> My disagreement has to do with simplicity. Assume you have as input a Dolby Atmos sound with up to 64 channels to split. Then each time you want to process one channel you have to rewire all the channels. I don't think that is efficient. We need a ChannelSelector, or else ChannelSplitter should output a MFValue object where each of the channels has its own index in that MFValue.

Excellent rationale, thanks.  Yes let's keep the node and keep striving for simplest possible use of indexing, continuing with current approach that rarely needs indexing.

We definitely have to be careful with phrasing about channels, since (as you indicated above) a single sound source may consist of numerous contained channels.

Have applied the Gain and ChannelSelector nodes to the draft X3D4 Sound component, according to the definitions you provided, Efi.  Detailed notes follow.

a. Gain appears to be a multiplicative factor, not decibels.

[1] Web Audio API, 1.20. The GainNode Interface
     https://www.w3.org/TR/webaudio/#gainnode

- default values are almost always 1.
- Curiously no indication that negative values are allowed for signal negation.
- One value is specifically identified as linear (not dB); two values are specifically identified as decibels:

===========================
1.27. The PannerNode Interface, 1.27.2. Attributes
https://www.w3.org/TR/webaudio/#PannerNode-attributes

"coneOuterGain, of type double
     A parameter for directional audio sources that is the gain outside of the coneOuterAngle. The default value is 0. It is a linear value (not dB) in the range [0, 1]. An InvalidStateError MUST be thrown if the parameter is outside this range."

- numerous mentions of dB throughout the recommendation.

===========================
1.13.4. BiquadFilterOptions

https://www.w3.org/TR/webaudio/#dom-biquadfiltertype-lowshelf (also highshelf and peaking)

gain, of type AudioParam, readonly
    The gain of the filter. Its value is in dB units. The gain is only used for lowshelf, highshelf, and peaking filters.

===========================
1.19. The DynamicsCompressorNode Interface

https://www.w3.org/TR/webaudio/#DynamicsCompressorNode-attributes

reduction, of type float, readonly
A read-only decibel value for metering purposes, representing the current amount of gain reduction that the compressor is applying to the signal. If fed no signal the value will be 0 (no gain reduction).
===========================

X3D4 specification prose now says dB for these two cases, and not dB elsewhere.

I think we should recommend to the W3C Audio Working Group that the recommendation more explicitly identify whether each gain is a dB value or a linear amplitude factor.
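For reference, the relationship between linear gain factors and decibel values can be sketched as follows (helper function names are illustrative, not from either specification):

```javascript
// Convert between linear amplitude gain factors (as used by GainNode.gain)
// and decibel values (as used by BiquadFilterNode gain and
// DynamicsCompressorNode reduction).  Helper names are illustrative only.

function linearToDecibels(gain) {
  // 20 * log10(gain); a linear gain of 1 is 0 dB (signal unchanged)
  return 20 * Math.log10(gain);
}

function decibelsToLinear(db) {
  // inverse conversion: 0 dB is a linear factor of 1
  return Math.pow(10, db / 20);
}

console.log(linearToDecibels(1));   // 0 (default gain, no change)
console.log(linearToDecibels(0.5)); // about -6.02 dB
console.log(decibelsToLinear(0));   // 1
```

This also shows why a shared 'gain' field with two different unit conventions (factor of 1 versus 0 dB) describes the same identity operation with different defaults.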

---

b. I was thinking it might be good to rename all 'gain' fields as 'gainOutput' for clarity that any gain factor is applied last.  I did not make such a change, staying consistent with Web Audio API (and avoiding further potential naming ambiguity with decibels).

Added 'gain' field to a majority of node types and nodes.  Please check them.

A problem with adding 'gain' to X3DSoundProcessingNode is that it has a default value of 1, while BiquadFilter's gain has a different default of 0 dB.  Not sure what to do about this yet; will think about how we note this distinction.

Curiously, adding gain to the abstract interfaces also adds gain to AudioClip and MovieTexture... seems useful.  What do you think?

---

c. Changed specification's editorial note

	# Mechanisms for parent-child input-output graph design remain under review

by appending

	MFNode [in out] inputs  NULL  [X3DSoundAnalysisNode,X3DSoundChannelNode,X3DSoundProcessingNode,X3DSoundSourceNode]

Whenever unambiguous we are silent about output channels; the node itself provides those.  For example, the X3DSoundProcessingNode nodes take an array of node 'inputs' and process the signals of each, providing similar outputs.
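As a sketch of how these parent-child 'inputs' relationships avoid index-based wiring, an audio graph might nest as follows (DEF names and field values are hypothetical, illustrating the proposed design only):

```xml
<!-- Illustrative sketch: an AudioBufferSource feeds a BiquadFilter,
     a Gain node scales the filtered signal, and the parent Sound node
     renders the result.  All DEF names are hypothetical. -->
<Sound>
  <Gain DEF='MasterGain' gain='0.8' containerField='inputs'>
    <BiquadFilter DEF='LowPassFilter' containerField='inputs'>
      <AudioBufferSource DEF='LoopClip' containerField='inputs'/>
    </BiquadFilter>
  </Gain>
</Sound>
```

No ROUTE statements or channel indices are needed; each node's output is implicitly connected to its parent's inputs.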

d. From WD2 draft specification: "TODO: do most or all interfaces include a gain field?"

From our discussions, the answer was clearly yes.  The inclusion of the Gain node is considered a helpful addition, not a replacement.

Added gain field throughout; different default of 0 for BiquadFilter.

   SFFloat  [in,out] gain   1    [-∞,∞)

---

e. Added Gain and ChannelSelector to Table 5 support levels.

---

f. X3DSoundAnalysisNode, Analyser

This node is only allowed to have 0 or 1 input channels, which are available to be passed through.

* 1.8. The AnalyserNode Interface
   https://www.w3.org/TR/webaudio/#analysernode

Added enabled field and MFNode inputs:

   # Mechanisms for parent-child input-output graph design remain under review
   MFNode   [in out] inputs                NULL       [X3DSoundAnalysisNode,X3DSoundChannelNode,X3DSoundProcessingNode,X3DSoundSourceNode]

---

continuing:

On 10/2/2020 5:56 AM, Eftychia Lakka wrote:
> 
> Dear all,
> 
> sorry for the late reply.
> 
> The proposal for Gain and ChannelSelector node is:
> 
> 16.4.11 *Gain* : X3DSoundProcessingNode {
> 
> SFString [in,out] description              ""
> 
>        SFFloat  [in,out] gain                     1             [-∞,∞)
> 
>        SFInt32  [in,out] channelCount             0             [0,∞)
> 
>        SFString [in,out] channelCountMode         "max"         ["max", "clamped-max", "explicit"]
> 
>        SFString [in,out] channelInterpretation     "speakers"   ["speakers", "discrete"]
> 
>        SFInt32  [in,out] numberOfInputs            0            [0,∞)
> 
>        SFInt32  [in,out] numberOfOutputs           0            [0,∞)}

Revised the following:

> The gain field represents the amount of gain to apply. Its default value is 1.

as:

===========================
16.2.4 Sound effects processing

     The Gain node amplifies or deamplifies the input signal.

     The <i>gain</i> field is a factor that represents the amount of linear amplification to apply.
     Decibel values shall not be used.
     Negative gain factors negate the input signal.
===========================
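The negation behavior of a linear gain factor can be sketched with plain array math (not an actual Web Audio or X3D implementation, just the arithmetic the prose describes):

```javascript
// Apply a linear gain factor sample-by-sample.  A gain of -1 negates
// (inverts the phase of) the signal; a gain of 1 leaves it unchanged.
// Plain-array sketch only, not an AudioNode implementation.
function applyGain(samples, gain) {
  return samples.map(sample => sample * gain);
}

console.log(applyGain([0.5, -0.25, 1.0], 2));   // [1, -0.5, 2]
console.log(applyGain([0.5, -0.25, 1.0], -1));  // [-0.5, 0.25, -1]
```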

===============================================================================================================
> 
> 16.4.7 *ChannelSelector *: X3DSoundChannelNode {
> 
>        SFInt32  [in out] channelNumber            0            [0,∞)
> 
> SFNode   [in out] channelSplitter                       [ChannelSplitter]
> 
> SFString [in,out] description              ""
> 
> SFInt32  [in,out] channelCount             0            [0,∞)
> 
>        SFString [in,out] channelCountMode         "max"        ["max", "clamped-max", "explicit"]
> 
>        SFString [in,out] channelInterpretation     "speakers"  ["speakers", "discrete"]
> 
>        SFInt32  [in,out] numberOfInputs            0           [0,∞)
> 
>        SFInt32  [in,out] numberOfOutputs           0           [0,∞)}
> 
> The channelNumber field determines a single channel from a ChannelSplitter node.
> 
> The channelSplitter field specifies with which ChannelSplitter node is linking.

Added MFNode inputs for consistency.

Renamed channelSplitter as outputChannel for clarity; further improvements welcome.

Added the other X3DSoundChannelNode fields for consistency.

======================================================
16.4.7 ChannelSelector
ChannelSelector : X3DSoundChannelNode {
   SFString [in,out] description  ""
   SFBool   [in,out] enabled      TRUE
   SFFloat  [in,out] gain         1     [0,∞)
   SFBool   [in,out] loop         FALSE
   SFNode   [in,out] metadata     NULL  [X3DMetadataObject]
   SFTime   [in,out] pauseTime    0     (-∞,∞)
   SFTime   [in,out] resumeTime   0     (-∞,∞)
   SFTime   [in,out] startTime    0     (-∞,∞)
   SFTime   [in,out] stopTime     0     (-∞,∞)
   SFTime   [out]    elapsedTime
   SFBool   [out]    isActive
   SFBool   [out]    isPaused

   SFInt32  [in,out] channelNumber         0          [0,∞)

   SFInt32  [out] channelCount          0          [0,∞)
   SFString [in,out] channelCountMode      "max"      ["max", "clamped-max", "explicit"]
   SFString [in,out] channelInterpretation "speakers" ["speakers", "discrete"]
   # Mechanisms for parent-child input-output graph design remain under review
   MFNode   [in out] inputs                NULL        [X3DSoundAnalysisNode,X3DSoundChannelNode,X3DSoundProcessingNode,X3DSoundSourceNode]
}
ChannelSelector selects a single channel output from all inputs.

The channelNumber field indicates which channel to select, with index values beginning at 0.
======================================================
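A usage sketch for ChannelSelector (DEF names are hypothetical): selecting the third channel of a multichannel source and passing it to a parent processing node, with no rewiring of the other channels:

```xml
<!-- Illustrative: select channel 2 (the third channel, indices begin
     at 0) of a multichannel source.  The ChannelSelector output then
     feeds the parent Gain node.  DEF names are hypothetical. -->
<Gain DEF='Channel2Gain' containerField='inputs'>
  <ChannelSelector channelNumber='2' containerField='inputs'>
    <AudioBufferSource DEF='SurroundClip' containerField='inputs'/>
  </ChannelSelector>
</Gain>
```

This matches the rationale above: processing one channel of a many-channel source without restating all the others.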

> Best regards,
> Efi Lakka
> 
> 
> 
> Στις Πέμ, 1 Οκτ 2020 στις 8:59 μ.μ., ο/η Don Brutzman <brutzman at nps.edu <mailto:brutzman at nps.edu>> έγραψε:
> 
>     Dick and I continued today.
> 
>     Here is a way to avoid any need for a ChannelSelector node by using the available ChannelSplitter node:
> 
>     <ChannelSplitter DEF='ChannelSelectionDemo' channelCountMode='explicit'>
>           <AudioBufferSource DEF='AudioBufferSource2'/>
>           <Gain DEF='IgnoreChannel_0'  containerField='outputs'/> <!-- initial output is audio channel 0 -->
>           <Gain DEF='IgnoreChannel_1'  containerField='outputs'/> <!-- second  output is audio channel 1 -->
>           <Gain USE='SelectGain_other' containerField='outputs'/> <!-- third   output is audio channel 2 -->
>     </ChannelSplitter>
> 
>     Is this sufficient to avoid defining a new ChannelSelector node in the specification?  Seems so.  Examples will reveal whether it is really needed.

Agreed with rationale at top of this message; have added the ChannelSelector node.

>     It looks like we have solutions that avoid the need for ROUTE connections, or index numbers, to create an audio graph.  That meets our hoped-for design goals of simplicity.
> 
>     This means that ROUTE connections can be dedicated to the purpose of animating an audio graph: enable on/off, changing gain, etc.

Yes, still the case.  Pretty exciting!

>     X3D4 specification:
> 
>      > 16.4.6 ChannelMerger
>      > ChannelMerger : X3DSoundChannelNode {
>      >   SFString [in,out] description  ""
>      >   SFBool   [in,out] enabled      TRUE
>      >   SFBool   [in,out] loop         FALSE
>      >   SFNode   [in,out] metadata     NULL  [X3DMetadataObject]
>      >   SFTime   [in,out] pauseTime    0     (-∞,∞)
>      >   SFTime   [in,out] resumeTime   0     (-∞,∞)
>      >   SFTime   [in,out] startTime    0     (-∞,∞)
>      >   SFTime   [in,out] stopTime     0     (-∞,∞)
>      >   SFTime   [out]    elapsedTime
>      >   SFBool   [out]    isActive
>      >   SFBool   [out]    isPaused
>      >
>      >   SFInt32  [in,out] channelCount          0          [0,∞)
>      >   SFString [in,out] channelCountMode      "max"      ["max", "clamped-max", "explicit"]
>      >   SFString [in,out] channelInterpretation "speakers" ["speakers", "discrete"]
>      >   SFInt32  [in,out] numberOfInputs        0          [0,∞)
>      >   SFInt32  [in,out] numberOfOutputs       0          [0,∞)
>      >   # Mechanisms for parent-child input-output graph design remain under review
>      > }
>      > ChannelMerger unites different monophonic input channels into a single output channel.
> 
>     Here are needed updates for ChannelMerger:
> 
>     MFNode [in out] inputs [X3DSoundProcessingNode] # multiple inputs

added

>     SFNode [in out] output [X3DSoundProcessingNode] #   single output

Actually no SFNode output is needed, so it was not added.  The node itself is used as part of the next node's inputs array.  Same results, terser.
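For example (hypothetical DEF names), a ChannelMerger needs no explicit output field because it simply appears among the next node's inputs:

```xml
<!-- Illustrative: two monophonic sources merged, with the ChannelMerger
     itself serving as an input of the parent Sound node.
     DEF names are hypothetical. -->
<Sound>
  <ChannelMerger containerField='inputs'>
    <AudioBufferSource DEF='LeftTrack'  containerField='inputs'/>
    <AudioBufferSource DEF='RightTrack' containerField='inputs'/>
  </ChannelMerger>
</Sound>
```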

>     If we only have single output, then numberOfOutputs field is no longer needed.
> 
>      > 16.4.7 ChannelSplitter
>      > ChannelSplitter : X3DSoundChannelNode {
>      >   SFString [in,out] description  ""
>      >   SFBool   [in,out] enabled      TRUE
>      >   SFBool   [in,out] loop         FALSE
>      >   SFNode   [in,out] metadata     NULL  [X3DMetadataObject]
>      >   SFTime   [in,out] pauseTime    0     (-∞,∞)
>      >   SFTime   [in,out] resumeTime   0     (-∞,∞)
>      >   SFTime   [in,out] startTime    0     (-∞,∞)
>      >   SFTime   [in,out] stopTime     0     (-∞,∞)
>      >   SFTime   [out]    elapsedTime
>      >   SFBool   [out]    isActive
>      >   SFBool   [out]    isPaused
>      >
>      >   SFInt32  [in,out] channelCount          0          [0,∞)
>      >   SFString [in,out] channelCountMode      "max"      ["max", "clamped-max", "explicit"]
>      >   SFString [in,out] channelInterpretation "speakers" ["speakers", "discrete"]
>      >   SFInt32  [in,out] numberOfInputs        0          [0,∞)
>      >   SFInt32  [in,out] numberOfOutputs       0          [0,∞)
>      >   # Mechanisms for parent-child input-output graph design remain under review
>      > }
>     Needed for ChannelSplitter:
> 
>     SFNode [in out] inputs  [X3DSoundProcessingNode] #   single input
>     MFNode [in out] outputs [X3DSoundProcessingNode] # multiple outputs

added

>     Since these field definitions can vary, they would not go in parent abstract interface X3DSoundProcessingNode.

confirmed

>     Field numberOfInputs can be omitted, only a single input node.

applied (with style "proposedDeletion" for clear reviewing)

>     This probably means that channelCount, numberOfOutputs are accessType outputOnly [out].  The values are determined by the node children for each field.

Actually it looks like channelCount and numberOfOutputs are identical with this approach.  Therefore applied proposedDeletion for the numberOfOutputs field.

>     If agreed that ChannelSelector is superfluous, then the only remaining node that needs to be added to specification is Gain.

As noted earlier, both nodes are now included.

>     ... but more work is needed on Sound and SpatialSound.

Dick thought we shouldn't overload the SFNode 'source' field... seems prudent.  Added the same inputs field as other nodes, so that audio graph construction can feed either Sound or SpatialSound (i.e. X3DSoundSourceNode).

	# Mechanisms for parent-child input-output graph design remain under review
	MFNode   [in out] inputs            NULL      [X3DSoundAnalysisNode,X3DSoundChannelNode,X3DSoundProcessingNode,X3DSoundSourceNode]

>     We need an X3D definition for "audio graph" term.  Suggested draft:
> 
>     * An /audio graph/ is a collection of nodes structured to process audio inputs and outputs
>         in a manner that is constrained to match the structure allowed by the Web Audio API.

Added to Glossary as definition 3.1.3.  A simpler version looks appropriate there.  Refinements welcome.

	3.1.3
	audio graph
	structured collection of nodes that process audio inputs and outputs

We have already added the Web Audio API to the Normative references as [W3C-WebAudio] since necessary functionality is found there.

>     We have defined all of the new nodes (beyond Sound, Spatial Sound and AudioClip) to match the terms and capabilities of Web Audio API.

I think we are in good shape.

>     This means a collection of the new nodes, that together can create and process sound, produces a result that feeds the inputs of our Sound and SpatialSound nodes.  In combination, the output is similar to a computational version of a simple AudioClip node.  It is a source, computationally created, whereas the AudioClip is a prerecorded version.
> 
>     ========================================
>     Basic stages for flow of sound, from source to destination:
> 
>     a. Sources of sound (perhaps an audio file or MicrophoneSource, perhaps signal processing of channels in audio graph),
> 
>     b. X3D Sound or SpatialSound node (defining location direction and characteristics of expected sound production in virtual 3D space),
> 
>     c. Propagation (attenuation model, may be modified by AcousticProperties based on surrounding geometry),
> 
>     d. Reception point (avatar "ears" or recordable listening point at some location and direction, that "hears" result, with left-right pan and spatialization).
>     ========================================

This seems important to add to the specification.  Not finding a top-level overview there, I inserted it as

===========================================
16.2.1 Audio and sound spatial architecture

The Sound component provides a rich set of spatialized audio capabilities in a comprehensive architecture suitable for 3D models and virtual environments.

a. Signal sources for sound. In addition to playing inputs from prerecorded sound files, capabilities are provided for computational audio generation, microphones, and virtual listening points.
b. Virtual locations for sound generation. The Sound and SpatialSound nodes define location, direction, and characteristics of expected sound production in virtual 3D space.
c. Propagation. The attenuation model may be further modified by AcousticProperties of reflection, refraction and absorption based on surrounding geometry.
d. Reception points. Avatar-centered listening points and recordable listening points, each with arbitrary location and direction, can receive acoustic results modified by corresponding left-right pan and spatialization.
===========================================

>     Producing a figure for this partitioning would be a good idea.

Still thinking about it.

>     Issue: should Sound or SpatialSound only receive a single channel?  Seems realistic... but note that AudioClip and MovieTexture allow stereo (2 channel) outputs.  So such a restriction on channel count doesn't seem appropriate, except perhaps to have a limit of two channels.  Perhaps best to have spec stay silent and let implementations continue to handle # channels implicitly.

Still holds true.  Good point to keep in mind during spec review.

>     The Sound and SpatialSound nodes define how sound is manifested within the X3D scene. (Other nodes don't do that).
> 
>     For inputs to X3D Sound and SpatialSound, the 'source' field might be changed to (a) allow multiple inputs as MFNode, and (b) allow other computational sources.  In other words, change
> 
>              SFNode   [in,out] source NULL [X3DSoundSourceNode] # and other types
>     to
>              MFNode   [in,out] source NULL [X3DSoundSourceNode,X3DSoundDestinationNode,X3DSoundProcessingNode]
> 
>     or else a combination of inputs and outputs such as
> 
>              SFNode [in,out] source      NULL [X3DSoundSourceNode]
>              MFNode [in,out] processing  NULL [X3DSoundProcessingNode]
>              SFNode [in,out] destination NULL [X3DSoundDestinationNode]
>              SFNode [in,out] analysis    NULL [X3DSoundAnalysisNode]

Cleanest approach seems to be leaving 'source' field alone and adding 'inputs' field as described above.

>     Confirmed (as discussed yesterday) that X3DSoundProcessingNode interface needs
> 
>              SFFloat [in,out] outputGain 1.0

applied, but kept term 'gain' as described above for best symmetry with Web Audio API.

>     Whew, more to go!  We'll apply these changes to spec to simplify the discussion and advance our shared comprehension.

OK.  I think all the changes are applied, and we are finished with design of audio and sound component additions.

Only open tech issue is at top, regarding muting if enabled is off.

After that, only remaining steps are review, implementation and evaluation.

To facilitate review, have attached PDF of tonight's Sound component snapshot.  All feedback welcome.

all the best, Don
-- 
Don Brutzman  Naval Postgraduate School, Code USW/Br       brutzman at nps.edu
Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA   +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Extensible 3D (X3D), ISO_IEC 19775-1_202x, 16 Sound component.reduced.pdf
Type: application/pdf
Size: 668895 bytes
Desc: not available
URL: <http://web3d.org/pipermail/x3d-public_web3d.org/attachments/20201004/0b763a04/attachment-0001.pdf>

