[x3d-public] X3D Specification Editors: Audio and Sound, more progress and issues to resolve, minutes 23 July 2020

Don Brutzman brutzman at nps.edu
Thu Jul 23 22:44:38 PDT 2020


Sound editors weekly meeting.  Attendees: Efi Lakka, Thanos Malamos, Dick Puk, Don Brutzman.

We first went through Notes22_07_2020.pdf (attached), which answers multiple pending questions.

We agreed to add attached image Sound_Propagation_Phenomena.png as Figure 16.2 in section 16.2.2 Sound attenuation and spatialization.  The next figure will get renumbered as 16.3.

We next added Efi's new section 16.2.3 Sound propagation.  Some further editorial smoothing was applied.

========================
16.2.3 Sound propagation

Sound-propagation techniques can be used to simulate sound waves as they travel from each source to scene listening points, taking into account the expected interactions with various objects in the scene. In other words, spatial sound rendering includes the estimation of physical effects involved in sound propagation such as surface reflection (specular, diffuse) and wave phenomena (refraction, diffraction) within a 3D scene. Figure 16.2 provides an overview of the physical models of sound propagation that are considered.
========================

We next added corrections to the Shape component for the Appearance node's acousticProperties field.

We refined the phrasing for diffuse reflection and specular reflection etc. in AcousticProperties.  Provided here for review.

=========================
12.4.1 AcousticProperties

AcousticProperties : X3DAppearanceChildNode  {
    SFFloat [in,out] absorption 0    [0,1]
    SFFloat [in,out] diffuse    0    [0,1]
    SFNode  [in,out] metadata   NULL [X3DMetadataObject]
    SFFloat [in,out] refraction 0    [0,1]
    SFFloat [in,out] specular   0    [0,1]
}

The AcousticProperties node determines acoustic effects of surface-related physical phenomena, specifying the specular reflection, diffuse reflection, absorption, and refraction coefficients of materials. These coefficient values are expected to fully account for physical and structural characteristics of the associated geometry such as width, height, thickness, shape, softness and/or hardness, and density variations.

The absorption field specifies the sound absorption coefficient of a surface, which is the ratio of the sound intensity absorbed or otherwise not reflected by a specific surface to that of the initial sound intensity. This characteristic depends on the nature and thickness of the material. Sound energy is partially absorbed when it encounters fibrous or porous materials, panels that have some flexibility, volumes of air that resonate, and openings in room boundaries (e.g. doorways). Moreover, the absorption of sound by a particular shape depends on the angle of incidence and frequency of the sound wave.

The diffuse field describes the diffuse coefficient of sound reflection. This is one of the physical phenomena of sound that occurs when a sound wave strikes a plane surface, and part of the sound energy is reflected back into space in multiple directions.

The refraction field describes the sound refraction coefficient of a medium, which determines the change in propagation direction of a sound wave when it obliquely crosses the boundary between two mediums where its speed is different. These relationships are described by Snell's Law.

The specular field describes the specular coefficient of sound reflection, which is one of the physical phenomena of sound that occurs when a sound wave strikes a plane surface. Part of the sound energy is directly reflected back into space, where the angle of reflection is equal to the angle of incidence.
=========================
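Background for reviewers of the absorption and refraction paragraphs above (textbook acoustics, offered as context rather than proposed specification prose): the absorption coefficient is an intensity ratio, and Snell's Law relates the angles of incidence and refraction to the speed of sound in each medium:

    \alpha = \frac{I_{\mathrm{absorbed}}}{I_{\mathrm{incident}}}, \quad 0 \le \alpha \le 1

    \frac{\sin\theta_1}{\sin\theta_2} = \frac{c_1}{c_2}

where \theta_1 and \theta_2 are measured from the surface normal, and c_1, c_2 are the sound speeds in the two media.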

On 7/22/2020 6:46 AM, Eftychia Lakka wrote:
> Dear all,
> 
> please find in Notes22_07_2020.doc most of the answers to your questions.
> Also, I updated the *shape.html* --> refraction field of the AcousticProperties node, and the *sound.html* --> "16.2.3 Sound propagation" in the X3D spec (from github).

Reviewed and applied these changes to the github documents.

> Finally, I attach again AnalysisNode and Report (last versions).
> 
> If you cannot download the file, please find it here <https://www.dropbox.com/sh/x6fe3v8hxu179z7/AACKxJGETEaqRzsUmC-1HX0Ia?dl=0>.
> 
> I am focusing on the paper, so after Monday (27/07, deadline for the submission) I will review all the interfaces in the X3D spec (github).

super!!  agreed with your priorities.

> Best regards,
> Efi Lakka
> 
> On Wed, 22 Jul 2020 at 8:55 AM, Don Brutzman <brutzman at nps.edu <mailto:brutzman at nps.edu>> wrote:
> 
>     Some more comments for group resolution.  Most nodes have definitions and interfaces in github.
> 
>     We still have a number of gaps.  If we get the biggest sorted out, and have up-to-date field definitions, then am hoping we have a sufficiently mature "work in progress" (backed up by Web Audio API CR) for draft publication.
> 
>     Look below for 4..8, kept in order for simplest review and update.  See you Wednesday.
> 
>     -----
> 
>     On 7/21/2020 10:58 AM, Don Brutzman wrote:
>      > Efi, am continuing to update specification.  Am using
>      >
>      >      AnalysisNodes01_07_2020.pdf
>      >      Report01_07_2020updated.pdf

Will continue to use the latest documents; thanks for these updated attached versions.  I applied file compression wherever possible.

>      > Some gaps on my end, please help:
>      >
>      > ----
>      >
>      > 1. Not finding description but am finding interfaces for
>      >
>      >      X3DAudioListenerNode
>      >      X3DSoundAnalysisNode
>      >      X3DSoundChannelNode
>      >      X3DSoundDestinationNode
>      >      X3DSoundProcessingNode

The interfaces look OK.  We need the simple prose descriptions.

>      > Not finding description or interfaces for
>      >
>      >      AudioContext
>      >      BinauralListenerPoint
>      >      MicrophoneSource
>      >      VirtualMicrophoneSource

     AudioContext - It is deprecated

OK I have removed it.

     BinauralListenerPoint - It is not a different node; it is included in ListenerPoint
     MicrophoneSource - check the Report22_07_2020 (attached)
     VirtualMicrophoneSource - Is it different from ListenerPoint? We have not decided yet

more to follow on each of these...

>      > ----
>      >
>      > 2. Need resolution of comments in
>      >
>      > 16.3.7 X3DSoundSourceNode
>      >
>      >    TODO: do these fields go here or elsewhere in the hierarchy?
>      >    SFNode   [in,out] transform        NULL [Transform]
>      >    SFNode   [in,out] panner           NULL [Panner]
>      >    SFNode   [in,out] filter           NULL [BiquadFilter]
>      >    SFNode   [in,out] delay            NULL [Delay]

agreed to remove, that TODO block is now gone.

>      > ----
>      >
>      > 3. Inheritance questions
>      >
>      > In several cases you have inheritance such as
>      >
>      >      BiquadFilter : SoundProcessingGroup
>      >
>      > What does SoundProcessingGroup correspond to?

Efi has provided an answer: the SoundProcessingGroup abstract node has been replaced by X3DSoundProcessingNode.

Correction applied to sound.html

>      > ----
>      >
>      > have most interfaces added in github, please review.
> 
>     -----
> 
>     4.  ListenerPoint and BinauralPoint.
> 
>     a. The following fields should be SFVec3f or SFRotation for type safety.  Think about animation, we want to be able to use PositionInterpolator and OrientationInterpolator (for example) to animate these points.
> 
>         SFFloat  [in,out] positionX 0 (-∞,∞)
>         SFFloat  [in,out] positionY 0 (-∞,∞)
>         SFFloat  [in,out] positionZ 0 (-∞,∞)
>         SFFloat  [in,out] forwardX 0 (-∞,∞)
>         SFFloat  [in,out] forwardY 0 (-∞,∞)
>         SFFloat  [in,out] forwardZ -1 (-∞,∞)
>         SFFloat  [in,out] upX 0 (-∞,∞)
>         SFFloat  [in,out] upY 1 (-∞,∞)
>         SFFloat  [in,out] upZ 0 (-∞,∞)
> 
>     Also note that if we are treating ListenerPoint similar to Viewpoint, we do not need to specify the upDirection vector.  Viewpoint navigation already knows "up" since that is the +Y axis for the overall scene, as used by NavigationInfo already.
> 
>     Suggested interface, matching X3DViewpointNode:
> 
>         SFRotation [in,out] orientation       0 0 1 0 [-1,1],(-∞,∞)
>         SFVec3f    [in,out] position          0 0 10  (-∞,∞)

accepted
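For example, with SFVec3f/SFRotation typing the standard interpolator pattern applies directly.  A minimal sketch in X3D XML encoding, assuming the draft ListenerPoint node and suggested fields above (not final specification text):

    <ListenerPoint DEF='Walker' description='listener moving through scene' position='0 0 10'/>
    <TimeSensor DEF='Clock' cycleInterval='10' loop='true'/>
    <PositionInterpolator DEF='Path' key='0 0.5 1' keyValue='0 0 10  5 0 5  0 0 10'/>
    <ROUTE fromNode='Clock' fromField='fraction_changed' toNode='Path' toField='set_fraction'/>
    <ROUTE fromNode='Path' fromField='value_changed' toNode='Walker' toField='set_position'/>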

>     b. Next.  Looking at interfaces,
> 
>     ==============================================
>     BinauralListenerPoint : X3DAudioListenerNode {
>         or
>     ListenerPoint : X3DAudioListenerNode {
>         SFBool     [in]     set_bind
>         SFString   [in,out] description ""
>         SFBool     [in,out] enabled     TRUE
>         SFFloat    [in,out] gain        1       [0,∞)
>         SFNode     [in,out] metadata    NULL [X3DMetadataObject]
>         SFRotation [in,out] orientation 0 0 1 0 [-1,1],(-∞,∞)
>         SFVec3f    [in,out] position    0 0 10  (-∞,∞)
>     # SFBool     [in,out] isViewpoint TRUE    # TODO needed?  rename?
>         SFTime     [out]    bindTime
>         SFBool     [out]    isBound
>     }
> 
>     ListenerPoint represents the position and orientation of the person listening to the audio scene.
>     It provides single or multiple sound channels as output.
>         or
>     BinauralListenerPoint represents the position and orientation of the person listening to the audio scene, providing binaural output.
>     ==============================================
> 
>     Can BinauralListenerPoint be handled equivalently by ListenerPoint? The output from this node is implicit and so no separate typing of output stream is needed.  The main difference is separation distance of two ears:
> 
>     [1] Wikipedia: Binaural
>     https://en.wikipedia.org/wiki/Binaural
> 
>     [2] Wikipedia: Sound localization
>     https://en.wikipedia.org/wiki/Sound_localization
> 
>     To keep a separate node, we would need to define an interauralDistance value.  For specification context, I think that will be a necessary parameter for WebXR headsets.
> 
>     Let's discuss.  If interauralDistance field seems sensible, we might simply add it to ListenerPoint with default value of 0.  Does that sound OK?
> 
>         SFFloat [in,out] interauralDistance  0  [0,∞)
> 
>     I think we can safely omit BinauralListenerPoint as an unnecessary node.

accepted

16.4.6 BinauralListenerPoint has been removed.

>     c. If we do not need BinauralListenerPoint then we might not need intermediate abstract interface X3DAudioListenerNode... though I recall we had some discussion of other potential future listeners.  Opinions please.

16.3.1 X3DAudioListenerNode will also be removed, since we do not currently see any expected future value and so it is a needless complication.

Changes applied.

>     d. isViewpoint deserves discussion.  Draft prose says
> 
>     "isViewpoint specifies if the listener position is the viewpoint of camera. If the isViewpoint field is FALSE, the user uses the other fields to determine the listener position."
> 
>     Let's list use cases, and discuss please:
> 
>     d.1. If the base functionality desired is to listen at a specific location in the scene, stationary or animated, then we're done.
> 
>     d.2. If the functionality desired is simply to follow a user's view with no audio processing involved, then there is no need for ListenerPoint since Sound or SpatialSound can be used directly.

... to produce sound, and listening from the current view is already default behavior in VRML/X3D.  So this use case can be supported if no ListenerPoint is bound and connected.

>     d.3. If the audio from a specific Viewpoint is needed, then the ListenerPoint might be a child field of X3DViewpointNode, just as we are now connecting NavigationInfo to viewpoints.  Such a ListenerPoint might also be animated simultaneously with a given Viewpoint, simple to do.

If sensible, we will consider this convenience field in this draft; the capability is already available to authors by simply binding a Viewpoint and a ListenerPoint simultaneously at the same location in the scene graph transformation hierarchy.  Note that such a ListenerPoint definition includes the exact same spatial offset as the Viewpoint's predefined position.
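For example, a sketch of that authoring pattern in X3D XML encoding (draft ListenerPoint node, fields assumed as above): both nodes as siblings under one Transform, sharing the same transformation hierarchy and animatable together.

    <Transform DEF='Podium' translation='0 1.6 8'>
      <!-- both nodes retain their identical default position offsets -->
      <Viewpoint DEF='PodiumView' description='speaker podium'/>
      <ListenerPoint DEF='PodiumEars' description='listen from the podium'/>
    </Transform>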

>     d.4. If the audio from the current viewing position is desired, then we might improve clarity of this field's semantics by renaming it as "trackCurrentView".  This permits author creation of multiple ListenerPoint nodes for generating audio chains of different effects on sound received at the current user location.

this is the tricky case that took a lot of discussion.

>     Let's go with "trackCurrentView" for now - improved name candidates welcome.  Let's definitely avoid "isViewpoint" because that is confusing and seems to conflate purpose of different nodes.
> 
>         SFBool     [in,out] trackCurrentView FALSE

We will try to use this simple name for supporting this use case, i.e. tracking along with the user's current point of view, and avoid intermingling ListenerPoint functionality with the Viewpoint node.

Implementation and evaluation of examples will be important for refining and confirming proper support for this use case.
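If adopted, usage could be as simple as the following sketch (field name provisional, node still draft):

    <ListenerPoint description='effects chain fed from current user location' trackCurrentView='true'/>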

>     ----
> 
>     5.  Am completely lost by multiple entries of "Heritage from AudioNode" - did we miss an abstract node type?

Efi reports "NO, it is not a new X3D node. It is referred to the corresponding AudioNode from Web Audio API, but the attributes of this should be included in X3D."

Aha.  So "heritage" is appropriate rather than "inheritance": "AudioNode" refers to the W3C Web Audio API.  Updated/disambiguated comment in interfaces:

	<!-- Related to W3C Audio API AudioNode -->

[1] Web Audio API
      W3C Candidate Recommendation, 11 June 2020
      https://www.w3.org/TR/webaudio

      1.5. The AudioNode Interface
      https://www.w3.org/TR/webaudio/#audionode

Therefore we integrate those fields in existing X3D nodes as appropriate.  If any go into existing X3D abstract node types, such refactoring will become evident after repeated review and implementation.

AudioNode confirmed not present in X3D4 draft specification.

>     ----
> 
>     6.  If VirtualMicrophoneSource is a virtual microphone source, then isn't this the same as ListenerPoint?
> 
>     Is there a different definition? Can we omit this node?

agreed, comment accepted, we will omit VirtualMicrophoneSource... now removed.

New question: can multiple ListenerPoint nodes be active?  In other words, are they listening even if not bound, in order to feed audio graphs and media streams?

Agreed that this makes sense and seems appropriate; multiple audio channels might be working simultaneously.  Will work on prose; how about:

"Multiple ListenerPoint nodes can be active for sound processing, but only one can be bound as the active listening point for the user."

>     ----
> 
>     7. What are fields for MicrophoneSource?

Efi has provided an initial draft, now added, also including an enabled field.  Looks like a start.

MicrophoneSource : X3DSoundSourceNode {
    MFNode   [in,out] audioGraph     []   [X3DChildNode]
    SFString [in,out] description    ""
    SFBool   [in,out] enabled        TRUE
    SFBool   [out]    isActive
    SFString [in,out] mediaDevicesid ""
    SFNode   [in,out] metadata       NULL [X3DMetadataObject]
}
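A hypothetical usage sketch under the draft fields above, purely illustrative:

    <!-- capture local microphone input as a live sound source -->
    <MicrophoneSource DEF='LiveMic' description='local microphone' enabled='true'/>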

>     ----
> 
>     8. SpatialSound
> 
>     Similarly changed:
>         SFFloat  [in,out] positionX 0 (-∞,∞)
>         SFFloat  [in,out] positionY 0 (-∞,∞)
>         SFFloat  [in,out] positionZ 0 (-∞,∞)
>         SFFloat  [in,out] orientationX 1 (-∞,∞)
>         SFFloat  [in,out] orientationY 0 (-∞,∞)
>         SFFloat  [in,out] orientationZ 0 (-∞,∞)
> 
>     to
> 
>         SFVec3f [in,out] direction  0 0 1 (-∞,∞)
>         SFFloat [in,out] intensity  1     [0,1]
>         SFVec3f [in,out] location   0 0 0 (-∞,∞)
> 
>     matching Sound node.
> 
>     Potential problem: direction vector is hard to animate... typically if changed orientation is needed, then it is placed in a parent Transform, so we probably can leave it alone.

Efi reports "For me it seems ok"...

I think the point here is not that it is incorrect/infeasible, but rather we should keep the patterns of point of listening as similar to point of view as possible, in order to keep definition/animation simple for authors and consistent throughout.

Consistent simple design is good whenever we can achieve it.
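To make the parent-Transform point concrete: rather than animating the direction vector itself, an author re-aims the source by rotating its parent.  A sketch with the draft SpatialSound node and the field changes proposed above (AudioClip assumed usable as its source):

    <Transform DEF='Siren'>
      <SpatialSound direction='0 0 1' intensity='1' location='0 0 0'>
        <AudioClip url='"siren.wav"' loop='true'/>
      </SpatialSound>
    </Transform>
    <TimeSensor DEF='Spin' cycleInterval='8' loop='true'/>
    <OrientationInterpolator DEF='Aim' key='0 0.5 1'
                             keyValue='0 1 0 0  0 1 0 3.14159  0 1 0 6.28318'/>
    <ROUTE fromNode='Spin' fromField='fraction_changed' toNode='Aim' toField='set_fraction'/>
    <ROUTE fromNode='Aim' fromField='value_changed' toNode='Siren' toField='set_rotation'/>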

>     For SpatialSound, the "gain" field should be "intensity" in order to match Sound node.
> 
>     Am avoiding abbreviations.  Precise purpose of referenceDistance isn't yet clear to me.

TODO discussion.

>     I'm not clear about cone parameters... for now, changed degrees to radians.  Please explain further, got diagram?

still TODO, we didn't have time for that discussion.

>     If EQUAL_POWER is simple gain, is that the same as Sound node spatialize field?
> 
>     Is there a way to specify HRTF, or (presumably) is that part of browser configuration?  Those might be considered Personal Identifying Information (PII) and so am in no hurry to support that; might be part of WebXR.
> 
>     Perhaps HRTF should be a simple boolean, in combination with the spatialize field.  Seems simpler and sufficient.

Initial reactions welcome.  Recommendation: try a simple boolean for HRTF and see if everything is workable...  how about

    SFBool   [in,out] enableHRTF       FALSE

We agreed to try a simple boolean to keep things straightforward at this stage if possible.  No intention to provide a path for HRTF coefficients or rendering; that is a big separate effort and up to the browser.  Probably satisfactory for X3D 4.0 to encourage implementation and experimentation in a somewhat consistent manner.

>     ----
> 
>     9. Still needed, perhaps distilled from paper or Web Audio API?
> 
>              16.2.3 Sound effects processing
>              Sound streams can be manipulated by a variety of sound effects...

The preceding two questions are for follow-up discussion when we meet again next week.

Shared updated documents attached.  (Efi, I couldn't get AnalysisNodes22_07_2020.html to display, and so not attached)

Reviewed Report22_07_2020.pdf against 4 Concepts clause, 4.4.2.3 Interface hierarchy.  Removed BinauralListenerPoint, X3DAudioListenerNode, VirtualMicrophoneSource.

TODO: Gain node, removed or restored?  We can have an SFFloat field where needed; that seems simplest for animation and control.  Correct?  If so then Efi, you should remove the Gain interface from your hierarchy in the Report document.  (Otherwise we need to add this node interface and definition to the X3D draft.)

Gain : SoundProcessingGroup {
    SFFloat  [in,out] gain                  1          [0,∞)
    <!-- common fields from W3C Web Audio API interface AudioNode -->
    SFInt32  [in,out] numberOfInputs        0          [0,∞)
    SFInt32  [in,out] numberOfOutputs       0          [0,∞)
    SFInt32  [in,out] channelCount          0          [0,∞)
    SFString [in,out] channelCountMode      "max"
    SFString [in,out] channelInterpretation "speakers"
}
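To make the SFFloat alternative concrete: a gain fade then becomes one ScalarInterpolator route with no extra node in the processing chain.  A sketch assuming an SFFloat gain field on the draft ListenerPoint above:

    <ListenerPoint DEF='Ears'/>
    <TimeSensor DEF='FadeClock' cycleInterval='5'/>
    <ScalarInterpolator DEF='Fade' key='0 1' keyValue='1 0'/>
    <ROUTE fromNode='FadeClock' fromField='fraction_changed' toNode='Fade' toField='set_fraction'/>
    <ROUTE fromNode='Fade' fromField='value_changed' toNode='Ears' toField='set_gain'/>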

TODO: discuss node connections for building an audio graph.  First step found in documentation, not yet in github: "Insert

	MFNode [in,out] audioGraph [] [X3DChildNode]
	in
	(AudioClip, AudioBufferSource, OscillatorSource, StreamAudioSource, MicrophoneSource)"
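As a strawman for that discussion only: if audioGraph lists the processing nodes fed by a source, the XML encoding might nest them via containerField.  Every node and field shown here is an undecided assumption, with field names guessed from the corresponding Web Audio API attributes:

    <AudioClip url='"voice.wav"'>
      <BiquadFilter containerField='audioGraph' frequency='350'/>
      <Delay        containerField='audioGraph' delayTime='0.25'/>
    </AudioClip>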

Next: the paper takes priority (Efi will share with co-authors); we will work on both paper and spec.

Thanos and Efi think that prior demos are still consistent and feasible for the upcoming Web3D Webinar.

I now feel a lot more comfortable that we are converging on a partial draft of the Sound component that is sufficient for public release, review and comment, encouraging implementation and evaluation of X3D examples demonstrating the W3C Web Audio API.  New frontiers becoming ready for exploration and physically based auralization...

Thanks for steady impressive work, continuing.  Have fun with X3D sound and audio!  8)

all the best, Don
-- 
Don Brutzman  Naval Postgraduate School, Code USW/Br       brutzman at nps.edu
Watkins 270,  MOVES Institute, Monterey CA 93943-5000 USA   +1.831.656.2149
X3D graphics, virtual worlds, navy robotics http://faculty.nps.edu/brutzman


-------------- next part --------------
A non-text attachment was scrubbed...
Name: Sound_Propagation_Phenomena.png
Type: image/png
Size: 403738 bytes
Desc: not available
URL: <http://web3d.org/pipermail/x3d-public_web3d.org/attachments/20200723/84c3ffe7/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Notes22_07_2020.pdf
Type: application/pdf
Size: 264781 bytes
Desc: not available
URL: <http://web3d.org/pipermail/x3d-public_web3d.org/attachments/20200723/84c3ffe7/attachment-0003.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Report22_07_2020.pdf
Type: application/pdf
Size: 204991 bytes
Desc: not available
URL: <http://web3d.org/pipermail/x3d-public_web3d.org/attachments/20200723/84c3ffe7/attachment-0004.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AnalysisNodes22_07_2020.pdf
Type: application/pdf
Size: 388121 bytes
Desc: not available
URL: <http://web3d.org/pipermail/x3d-public_web3d.org/attachments/20200723/84c3ffe7/attachment-0005.pdf>

