[x3d-public] ScreenSensor, was: LineSensor

Andreas Plesch andreasplesch at gmail.com
Sun Apr 7 10:05:16 PDT 2019


Leonard,

thanks for the comprehensive response. Let's see, perhaps there is a
way to replicate a screen-aligned PlaneSensor with standard nodes:

Based on the HUD use case, it would be possible to use a ProximitySensor
to screen-align the tracking plane of a regular PlaneSensor. But then
you would not want to rotate the sibling geometries into screen
alignment along with the PlaneSensor, so the two need to be separated.
You could have a separate sensor plane, a large, invisible rectangle,
and additionally use a TouchSensor in another Transform hierarchy to
detect intersections with the actual target geometry (although per
spec. currently only the nearest intersection counts, but let's say
both the tracking plane and the geometry can be sensed).
Then the ProximitySensor could rotate the invisible sensor plane to
align it with the screen, and the TouchSensor's hitPoint could
translate the sensor plane to the target geometry.
But then translation_changed of the PlaneSensor will be in its own,
screen-aligned, coordinate system and would need to be transformed
into the target geometry's coordinate system, which is not screen
aligned. Use a script? The screen-aligned sensor does not have this
problem because its tracking plane is independent of the local
coordinate system.
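Roughly, as an untested sketch (the DEF names, sizes, and the
Rectangle2D/Box placeholders are mine, and the Script that would map
translation_changed into the target's coordinate system is omitted):

<!-- ProximitySensor rotates the Transform holding the invisible sensor
     plane and the PlaneSensor, keeping the tracking plane screen aligned;
     its size must be large enough to enclose the viewer. -->
<ProximitySensor DEF='PROX' size='10000 10000 10000'/>
<Transform DEF='SENSOR_PLANE'>
  <PlaneSensor DEF='PS' autoOffset='true'/>
  <Shape>
    <Appearance><Material transparency='1'/></Appearance>
    <Rectangle2D size='1000 1000'/> <!-- large, invisible (fully transparent) tracking rectangle -->
  </Shape>
</Transform>
<!-- Separate hierarchy with the actual target geometry and a TouchSensor. -->
<Transform DEF='TARGET'>
  <TouchSensor DEF='TS'/>
  <Shape>
    <Appearance><Material/></Appearance>
    <Box/>
  </Shape>
</Transform>
<ROUTE fromNode='PROX' fromField='orientation_changed'
       toNode='SENSOR_PLANE' toField='set_rotation'/>
<!-- hitPoint_changed is in TS's local coordinates; this only lines up with
     SENSOR_PLANE's parent coordinates while TARGET is untransformed. -->
<ROUTE fromNode='TS' fromField='hitPoint_changed'
       toNode='SENSOR_PLANE' toField='set_translation'/>
<!-- Open problem: PS.translation_changed is expressed in SENSOR_PLANE's
     screen-aligned coordinate system and would still need a Script to map
     it into TARGET's coordinate system before driving TARGET's translation. -->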

So I still think it will be very difficult to replicate a screen-aligned
PlaneSensor with standard nodes. Perhaps a challenge for someone?

But I think the main point is the perceived overlap with 2d HTML
pointer movement.

On Sun, Apr 7, 2019 at 12:05 AM Leonard Daly <Leonard.Daly at realism.com> wrote:
>
> Andreas,
>
> Sorry for not responding sooner on this thread -- it has been a busy week for me.
>
> I am concerned about the use of a screen-aligned sensor when displaying 3D content in a browser. Other environments may be useful and interesting, but my focus right now is browser-based displays.
>
> A screen-aligned sensor (either line or rectangular) occupies a rectangular region (a degenerate rectangle is a line) on the screen. This is the same as an HTML5 div tag that is overlaid on the screen. HTML will generate various events when the pointing device interacts with that region. These are different for cursor (mouse) and touch (finger or otherwise) [for example, there is no mouseover when there is only a touch-sensitive screen without a cursor].

For x3dom, both the overlaid div and the WebGL canvas will be mouse
event targets. But I think you may be starting to think of X3D nodes
such as Box as actual HTML content such as a div.

>
> The HTML events generated by the interaction all use screen coordinates (pixels, some of which are relative to the div origin). The X3D events are created using geometry-fractional coordinates. An X3D sensor exists in 3D space (sort of, as I understand it) but is not subject to rotation or skewing. If I am wrong, then how does a screen-aligned sensor differ from an existing TouchSensor node?

x3dom creates X3D sensor events from HTML events by identifying the
rendered pixel over which the pointer resides, and then identifying
the 3d world position of this pixel. The screen-aligned PlaneSensor
differs from TouchSensor by providing relative 3d translations which
can be used for 3d dragging, but I fear I am missing something.

> This allows the X3D screen-aligned sensor to change in size as the camera changes position.

The tracking plane is infinite, unless it is restricted by
min/maxPosition. Otherwise the sibling geometries control the size of
the sensed surface. The 3d orientation of the tracking plane changes
as the camera changes.

> This feature would need to be programmed into an HTML div for the same effect. So the X3D screen-aligned sensor exists in 3D space that is displayed on the canvas, whereas the HTML div exists in an HTML layer relative (in some manner) to the document and may be in front of or behind the canvas.

>
> In any case, there are two constructs that take the same sort of user interaction, but produce significantly different results.

So one construct is the HTML layer, and the other construct is the
screen-aligned sensor. In both you can drag parallel to the screen. In
the HTML layer you expect 2d translations; with the sensor you expect
3d translations.

> That will lead to confusion on the part of X3D developers and perhaps users.

Where is the confusion? Developers will understand what drag sensors
are, and users will find a screen-aligned PlaneSensor more intuitive
than a fixed-plane PlaneSensor.

Perhaps the conflict is about an envisioned HTML mouse event that
automatically works in 3d. So in addition to the 2d mouse pixel
position, the event payload would include a 3d position, and a
mousemove event would include a 3d translation value, calculated in a
screen-parallel 3d plane? Such a future mouse event would obviate the
need for the sensors.

> It is not clear what the official solution will be. The Wiki page at http://www.web3d.org/wiki/index.php/X3D_version_4.0_Development discusses X3D/DOM integration and many issues related to that. It appears that there is no definitive explicit statement as to capabilities on that page with specific regard to DOM integration.

I like my x-ite DOM bridge as a prototype for X3D-compatible DOM integration.

> Consider the experience of a virtual world (to fit into X3D V4.0, not V4.1) that runs in a browser. There is content external to the 3D display region that needs to be updated from the 3D content and vice-versa. The experience has a heads-up display containing important information and features about the experience. It is generally easier (for the developer) and produces a cleaner looking interface if the HUD is done in 2D; however, it could be done in 3D. One of the features is a 2D picker (aka TouchSensor). There are now (with the screen-aligned sensor) two different means for interacting with the user. Each does things a little differently. As a developer I would wonder about which to use:
>
> HTML5 div (easy to fix to the display)
> X3D TouchSensor (because it's attached to the HUD that is fixed in the display)
> X3D screen-aligned sensor (because it's better somehow?)

A HUD is not the intended use case for the screen-aligned sensor. The
use case is manipulating the 3d position of objects in the world by
dragging them. It is an alternative to a transform widget. HTML5 div
and TouchSensor cannot be considered potential solutions for this use
case as far as I can see.
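For contrast, the established fixed-plane pattern with standard nodes
looks roughly like this (untested sketch; the DEF names, color and the
min/maxPosition clamp are just placeholders):

<!-- Dragging a Box in the local XY plane with a regular PlaneSensor.
     The sensor affects all geometry below its parent Group;
     minPosition/maxPosition clamp translation_changed to a range. -->
<Group>
  <PlaneSensor DEF='PS' autoOffset='true'
               minPosition='-5 -5' maxPosition='5 5'/>
  <Transform DEF='DRAGGED'>
    <Shape>
      <Appearance><Material diffuseColor='0.8 0.2 0.2'/></Appearance>
      <Box/>
    </Shape>
  </Transform>
</Group>
<ROUTE fromNode='PS' fromField='translation_changed'
       toNode='DRAGGED' toField='set_translation'/>

The screen-aligned variant would keep this structure but track in a
screen-parallel plane instead of the local XY plane.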

> As a user I might wonder why the interaction does not support multi-touch. As a maintainer, I would need to figure out which sensor/events were source/primary so I can perform updates, etc.

Yeah, multitouch should be considered for PlaneSensor. It would allow
for scaling and rotating in addition to translating. This applies to
the regular PlaneSensor as well. Multitouch will have the added
complexity that it will be necessary to distinguish between multiple
pointing devices/hands with single fingers and a single device/hand
with multiple touchpoints. These interactions have to be programmed
for either X3D or HTML.

>
> I am having a very hard time figuring out what this sensor provides that is not available through other means. For example, doesn't a TouchSensor on a Billboard (with the right field values) stay screen aligned?

The difference is that with the sensor you normally would not want
its sibling geometries to become screen aligned as well. Usually you
only want the siblings to be translated within the screen plane, not
rotated into the screen plane, as a Billboard would do.
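As a quick sketch of the Billboard variant (untested, placeholder
content): the TouchSensor's sensing surface stays viewer-facing, but so
does the Box, because Billboard rotates all of its children.

<Billboard axisOfRotation='0 0 0'> <!-- 0 0 0: always face the viewer -->
  <TouchSensor DEF='TS'/>
  <Shape>
    <Appearance><Material/></Appearance>
    <Box/>
  </Shape>
</Billboard>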

TouchSensor by itself may not be suitable for dragging. I thought
that is why there are drag sensors in X3D and mouse/touchmove events
in HTML.

Let's see if we can use a TouchSensor for dragging.

Starting well, the initial touch/click generates isActive, which can
be picked up to store the initial hitPoint.
Then the pointer moves. We do not care if the pointer is still over
the object as long as the primary button is still pressed. The pointer
may move fast and may have moved beyond the object before its position
was updated. isActive is still true, but we do not get any hitPoint
updates anymore, so we do not know where the pointer is. This sounds
like a problem.
Let's say the pointer moves slowly and we do still get hitPoint updates.
We could then calculate the difference to the initial hitPoint using a
script and move the geometry with that relative translation.
However, the relative translation now depends on the shape of the
geometry since the second hitPoint was located somewhere away from the
initial hitPoint. This would not work well for dragging a sphere, for
example.
So we need a defined plane to constrain the translation to. You could
pick the XY plane of the local coordinate system. Now you need to
project the second hitPoint onto the XY plane in a script. You will
end up reimplementing PlaneSensor, but it would be possible.
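A rough, untested Script sketch of that approach (the node and field
names are mine; the projection simply drops the z component, whereas a
faithful PlaneSensor reimplementation would intersect the pointing ray
with the plane):

<Group>
  <TouchSensor DEF='TS'/>
  <Transform DEF='DRAGGED'>
    <Shape>
      <Appearance><Material/></Appearance>
      <Sphere/>
    </Shape>
  </Transform>
</Group>
<Script DEF='DRAG'>
  <field name='isActive'    type='SFBool'  accessType='inputOnly'/>
  <field name='hitPoint'    type='SFVec3f' accessType='inputOnly'/>
  <field name='translation' type='SFVec3f' accessType='outputOnly'/>
  <field name='startPoint'  type='SFVec3f' accessType='initializeOnly' value='0 0 0'/>
  <field name='haveStart'   type='SFBool'  accessType='initializeOnly' value='false'/>
  <field name='active'      type='SFBool'  accessType='initializeOnly' value='false'/>
  <![CDATA[ecmascript:
    function isActive(value) {
      active = value;      // primary button pressed or released
      haveStart = false;   // restart the drag on the next hitPoint
    }
    function hitPoint(value) {
      if (!active) return; // only drag while the button is down
      // crude projection onto the local XY plane: drop the z component
      var p = new SFVec3f(value.x, value.y, 0);
      if (!haveStart) { startPoint = p; haveStart = true; return; }
      translation = p.subtract(startPoint); // relative translation so far
    }
  ]]>
</Script>
<ROUTE fromNode='TS'   fromField='isActive'         toNode='DRAG'    toField='isActive'/>
<ROUTE fromNode='TS'   fromField='hitPoint_changed' toNode='DRAG'    toField='hitPoint'/>
<ROUTE fromNode='DRAG' fromField='translation'      toNode='DRAGGED' toField='set_translation'/>

Even then the fast-pointer problem above remains: once the pointer
leaves the Sphere, hitPoint updates stop.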

So I do not see how TouchSensor can be used as a PlaneSensor without
actually reimplementing PlaneSensor.

Let's see how A-Frame deals with dragging; perhaps that provides some
insight into declarative dragging.

-Andreas

> Hi Leonard,
>
> I kept puzzling over your comment. The main function of the
> drag sensors is to translate from pixel/screen units to world
> units/coordinate systems, which requires some kind of 2d to 3d
> translation. Why would you not count this difference? This
> translation is quite difficult to do and benefits from consistency
> across scenes and systems. So the screen-aligned PlaneSensor is not
> about the event system and in fact does not introduce any new events.
> So I am not sure how the question about events applies. Is it about
> the DragSensors in general?
>
> -Andreas
>
> On Thu, Apr 4, 2019 at 9:59 AM Leonard Daly <Leonard.Daly at realism.com> wrote:
>
> [Removed all but the most recent post in thread. Previous post at http://web3d.org/mailman/private/x3d-public_web3d.org/2019-April/010460.html]
>
> Andreas,
>
> Reading over the explainer (github link), this looks a lot like an HTML mousedown/mousemove or touchstart/touchmove set of events. So (when running in a browser), is there any substantial difference between the two? I am not counting differences in the output coordinate system (pixels vs. scaled units). If no, then why introduce a new node and events for something that already exists? If yes, please explain what problem this solves that HTML's event system does not.
>
> Independent of the above, how would this work in a stereo immersive environment (headset)?
>
> Leonard Daly
>
>
>
> I am working on a spec. comment on screen aligned PlaneSensor
> functionality (aka PointSensor in freeWrl or SpaceSensor in Instant)
> and developed an example, a description and suggested spec. language
> here:
>
> https://bl.ocks.org/andreasplesch/f196e98c86bc9dc9686a7e5b4acede7d
> https://gist.github.com/andreasplesch/f196e98c86bc9dc9686a7e5b4acede7d
>
> This is an important but currently not available feature which cannot
> be implemented with a Proto; at least I cannot think of a way to do it.
>
> Any feedback will be welcome. Actually, I have a couple comments
> myself in a follow-up message.
>
> -Andreas
>
>
>
> --
> Leonard Daly
> 3D Systems & Cloud Consultant
> LA ACM SIGGRAPH Past Chair
> President, Daly Realism - Creating the Future



-- 
Andreas Plesch
Waltham, MA 02453


