[x3d-public] X3D Specification Editors: Audio and Sound, more progress and issues to resolve

Don Brutzman brutzman at nps.edu
Tue Jul 21 22:55:46 PDT 2020

Some more comments for group resolution.  Most nodes have definitions and interfaces in github.

We still have a number of gaps.  If we get the biggest sorted out, and have up-to-date field definitions, then am hoping we have a sufficiently mature "work in progress" (backed up by Web Audio API CR) for draft publication.

Look below for 4..9, kept in order for simplest review and update.  See you Wednesday.


On 7/21/2020 10:58 AM, Don Brutzman wrote:
> Efi, am continuing to update specification.  Am using
>      AnalysisNodes01_07_2020.pdf
>      Report01_07_2020updated.pdf
> Some gaps on my end, please help:
> ----
> 1. Not finding description but am finding interfaces for
>      X3DAudioListenerNode
>      X3DSoundAnalysisNode
>      X3DSoundChannelNode
>      X3DSoundDestinationNode
>      X3DSoundProcessingNode
> Not finding description or interfaces for
>      AudioContext
>      BinauralListenerPoint
>      MicrophoneSource
>      VirtualMicrophoneSource
> ----
> 2. Need resolution of comments in
> 16.3.7 X3DSoundSourceNode
>    TODO do these fields go here or elsewhere in the hierarchy?
>    SFNode   [in,out] transform        NULL [Transform]
>    SFNode   [in,out] panner           NULL [Panner]
>    SFNode   [in,out] filter           NULL [BiquadFilter]
>    SFNode   [in,out] delay            NULL [Delay]
> ----
> 3. Inheritance questions
> In several cases you have inheritance such as
>      BiquadFilter : SoundProcessingGroup
> What does SoundProcessingGroup correspond to?
> ----
> have most interfaces added in github, please review.


4.  ListenerPoint and BinauralPoint.

a. The following fields should be SFVec3f or SFRotation for type safety.  Think about animation: we want to be able to use PositionInterpolator and OrientationInterpolator (for example) to animate these points.

   SFFloat  [in,out] positionX 0 (-∞,∞)
   SFFloat  [in,out] positionY 0 (-∞,∞)
   SFFloat  [in,out] positionZ 0 (-∞,∞)
   SFFloat  [in,out] forwardX 0 (-∞,∞)
   SFFloat  [in,out] forwardY 0 (-∞,∞)
   SFFloat  [in,out] forwardZ -1 (-∞,∞)
   SFFloat  [in,out] upX 0 (-∞,∞)
   SFFloat  [in,out] upY 1 (-∞,∞)
   SFFloat  [in,out] upZ 0 (-∞,∞)

Also note that if we are treating ListenerPoint similarly to Viewpoint, we do not need to specify the upDirection vector.  Viewpoint navigation already knows "up", since that is the +Y axis for the overall scene, as used by NavigationInfo.

Suggested interface, matching X3DViewpointNode:

   SFRotation [in,out] orientation       0 0 1 0 [-1,1],(-∞,∞)
   SFVec3f    [in,out] position          0 0 10  (-∞,∞)
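To illustrate why a single SFRotation suffices: the forwardX/Y/Z and upX/Y/Z triples above are recoverable by rotating the default forward (0 0 -1) and up (0 1 0) vectors by the orientation field, exactly as Viewpoint does.  A minimal sketch (my own illustration using Rodrigues' rotation formula; axis assumed unit-length):

```python
import math

def rotate(axis, angle, v):
    """Rotate vector v about a unit axis by angle (Rodrigues' formula)."""
    ax, ay, az = axis
    c, s = math.cos(angle), math.sin(angle)
    dot = ax * v[0] + ay * v[1] + az * v[2]
    cross = (ay * v[2] - az * v[1],   # axis x v
             az * v[0] - ax * v[2],
             ax * v[1] - ay * v[0])
    return tuple(v[i] * c + cross[i] * s + axis[i] * dot * (1 - c)
                 for i in range(3))

# Default SFRotation orientation 0 0 1 0 recovers all six direction
# components from the defaults: forward (0 0 -1) and up (0 1 0).
orientation_axis, orientation_angle = (0.0, 0.0, 1.0), 0.0
forward = rotate(orientation_axis, orientation_angle, (0.0, 0.0, -1.0))
up = rotate(orientation_axis, orientation_angle, (0.0, 1.0, 0.0))
```

A single OrientationInterpolator routed to such an orientation field then animates both derived vectors in lockstep.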

b. Next.  Looking at interfaces,

BinauralListenerPoint : X3DAudioListenerNode {
ListenerPoint : X3DAudioListenerNode {
   SFBool     [in]     set_bind
   SFString   [in,out] description ""
   SFBool     [in,out] enabled     TRUE
   SFFloat    [in,out] gain        1       [0,∞)
   SFNode     [in,out] metadata    NULL [X3DMetadataObject]
   SFRotation [in,out] orientation 0 0 1 0 [-1,1],(-∞,∞)
   SFVec3f    [in,out] position    0 0 10  (-∞,∞)
#  SFBool     [in,out] isViewpoint TRUE    # TODO needed?  rename?
   SFTime     [out]    bindTime
   SFBool     [out]    isBound
}

ListenerPoint represents the position and orientation of the person listening to the audio scene.
It provides single or multiple sound channels as output.
BinauralListenerPoint represents the position and orientation of the person listening to the audio scene, providing binaural output.

Can BinauralListenerPoint be handled equivalently by ListenerPoint?  The output from this node is implicit, so no separate typing of the output stream is needed.  The main difference is the separation distance of the two ears:

[1] Wikipedia: Binaural

[2] Wikipedia: Sound localization

To keep a separate node, we would need to define an interauralDistance value.  For specification context, I think that will be a necessary parameter for WebXR headsets.

Let's discuss.  If an interauralDistance field seems sensible, we might simply add it to ListenerPoint with a default value of 0.  Does that sound OK?

   SFFloat    [in,out] interauralDistance 0 [0,∞)

I think we can safely omit BinauralListenerPoint as an unnecessary node.
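For context on why interauralDistance matters perceptually: the ear separation sets the interaural time difference (ITD) that drives sound localization.  A minimal sketch (my own illustration, not from the draft) using the Woodworth spherical-head approximation:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, dry air at 20 degrees C

def interaural_time_difference(interaural_distance, azimuth_radians):
    """Woodworth approximation of ITD for a spherical head.

    interaural_distance: ear-to-ear separation in meters
    azimuth_radians: source angle off the median plane, 0 = straight ahead
    Returns the arrival-time difference between the two ears in seconds.
    """
    radius = interaural_distance / 2.0
    theta = azimuth_radians
    return (radius / SPEED_OF_SOUND) * (theta + math.sin(theta))

# A source 90 degrees to one side of a typical ~0.18 m head yields an
# ITD under a millisecond; at azimuth 0 the ITD is exactly zero.
itd = interaural_time_difference(0.18, math.pi / 2)
```

With a default of 0, a browser could fall back to its own head model, while WebXR-aware applications supply a measured value.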

c. If we do not need BinauralListenerPoint then we might not need intermediate abstract interface X3DAudioListenerNode... though I recall we had some discussion of other potential future listeners.  Opinions please.

d. isViewpoint deserves discussion.  Draft prose says

"isViewpoint specifies if the listener position is the viewpoint of the camera. If the isViewpoint field is FALSE, the user uses the other fields to determine the listener position."

Let's list use cases, and discuss please:

d.1. If the base functionality desired is to listen at a specific location in the scene, stationary or animated, then we're done.

d.2. If the functionality desired is simply to follow a user's view with no audio processing involved, then there is no need for ListenerPoint since Sound or SpatialSound can be used directly.

d.3. If the audio from a specific Viewpoint is needed, then the ListenerPoint might be a child field of X3DViewpointNode, just as we are now connecting NavigationInfo to viewpoints.  Such a ListenerPoint might also be animated simultaneously with a given Viewpoint, simple to do.

d.4. If audio from the current viewing position is desired, then we might improve the clarity of this field's semantics by renaming it "trackCurrentView".  This permits authors to create multiple ListenerPoint nodes for generating audio chains of different effects on sound received at the current user location.

Let's go with "trackCurrentView" for now - improved name candidates welcome.  Let's definitely avoid "isViewpoint" because that is confusing and seems to conflate the purposes of different nodes.

   SFBool     [in,out] trackCurrentView FALSE


5.  Am completely lost by multiple entries of "Heritage from AudioNode" - did we miss an abstract node type?


6.  If VirtualMicrophoneSource is a virtual microphone source, then isn't this the same as ListenerPoint?

Is there a different definition? Can we omit this node?


7. What are fields for MicrophoneSource?


8. SpatialSound

Similarly changed:
   SFFloat  [in,out] positionX 0 (-∞,∞)
   SFFloat  [in,out] positionY 0 (-∞,∞)
   SFFloat  [in,out] positionZ 0 (-∞,∞)
   SFFloat  [in,out] orientationX 1 (-∞,∞)
   SFFloat  [in,out] orientationY 0 (-∞,∞)
   SFFloat  [in,out] orientationZ 0 (-∞,∞)

to:

   SFVec3f [in,out] direction  0 0 1 (-∞,∞)
   SFFloat [in,out] intensity  1     [0,1]
   SFVec3f [in,out] location   0 0 0 (-∞,∞)

matching the Sound node.

Potential problem: a direction vector is hard to animate directly... typically, if a changing orientation is needed, the node is placed in a parent Transform, so we can probably leave it alone.

For SpatialSound, the "gain" field should be "intensity" in order to match Sound node.

Am avoiding abbreviations.  The precise purpose of referenceDistance isn't yet clear to me.
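For what it's worth, in the Web Audio API (where the field is abbreviated refDistance on PannerNode), it is the distance at which distance attenuation gain is exactly 1; sources closer than that are not boosted, and farther sources roll off.  A sketch of the spec's "inverse" distance model, with my own spelled-out names:

```python
def inverse_distance_gain(distance, reference_distance=1.0, rolloff_factor=1.0):
    """Web Audio API "inverse" distance model (one of three models).

    Gain is exactly 1 at reference_distance; sources inside that radius
    are clamped (no boost), and farther sources roll off hyperbolically
    at a rate controlled by rolloff_factor.
    """
    d = max(distance, reference_distance)  # clamp: no boost inside refDistance
    return reference_distance / (
        reference_distance + rolloff_factor * (d - reference_distance))
```

If that reading is right, referenceDistance is essentially the "unity gain radius" of the sound, and keeping the unabbreviated name seems reasonable.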

I'm not clear about the cone parameters... for now, changed degrees to radians.  Please explain further - got a diagram?
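My understanding from the Web Audio API: coneInnerAngle and coneOuterAngle (degrees there, hence the radians change above) are *full* cone angles centered on the source's direction axis; inside the inner cone gain is 1, outside the outer cone gain is coneOuterGain, and in between it interpolates linearly.  A sketch of that model in radians (names and structure mine):

```python
import math

def cone_gain(angle_off_axis, inner_angle, outer_angle, outer_gain):
    """Directional cone gain, following the Web Audio API model in radians.

    angle_off_axis: absolute angle between the source's direction axis
    and the vector from source to listener.  inner_angle / outer_angle
    are full cone angles, so each boundary sits at half that angle.
    """
    if angle_off_axis <= inner_angle / 2.0:
        return 1.0                 # inside inner cone: full gain
    if angle_off_axis >= outer_angle / 2.0:
        return outer_gain          # outside outer cone: reduced gain
    # linear interpolation between the two cone boundaries
    fraction = (angle_off_axis - inner_angle / 2.0) / (
        (outer_angle - inner_angle) / 2.0)
    return 1.0 + fraction * (outer_gain - 1.0)
```

If this matches the intended semantics, a diagram showing the two nested cones would still help the prose.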

If EQUAL_POWER is simple gain, is that the same as Sound node spatialize field?
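For reference, EQUAL_POWER in the Web Audio API is stereo panning where the left/right gains follow a cosine/sine law so total power stays constant across the pan range - which does sound close in spirit to the Sound node's spatialize behavior.  A sketch (my own illustration):

```python
import math

def equal_power_gains(pan):
    """Equal-power stereo gains for pan in [-1, 1] (-1 = left, +1 = right).

    gain_left**2 + gain_right**2 == 1 at every pan position, so perceived
    loudness stays constant as a source moves across the stereo field.
    """
    x = (pan + 1.0) / 2.0                     # normalize pan to [0, 1]
    return math.cos(x * math.pi / 2), math.sin(x * math.pi / 2)
```

If the two mechanisms are indeed equivalent, mapping spatialize TRUE/FALSE onto panningModel choices might be all the specification prose needs.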

Is there a way to specify an HRTF, or (presumably) is that part of browser configuration?  HRTF profiles might be considered Personally Identifiable Information (PII), and so am in no hurry to support that; it might be part of WebXR.

Perhaps HRTF should be a simple boolean, in combination with the spatialize field.  Seems simpler and sufficient.


9. Still needed, perhaps distilled from paper or Web Audio API?

	16.2.3 Sound effects processing
	Sound streams can be manipulated by a variety of sound effects...


all the best, Don
Don Brutzman  Naval Postgraduate School, Code USW/Br       brutzman at nps.edu
Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA   +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman
