[x3d-public] Geometric/Geographic Deep Learning

Joseph D Williams joedwil at earthlink.net
Thu May 23 10:29:10 PDT 2019


➢ must mirror the data 

The thing is, to see and categorize something, you figure out how to identify and document points of interest. For X3D, this is more than just an image, even though you may sample it as a 2D image. Consider the MNIST handwritten digit data set (the "hello world" of deep learning) as a test of recognizing thousands of input data dimensions as a basic set of 10 simple characters. For X3D, the first job is getting the author's intent up and running. Then can come the time to accumulate data and create abstractions for various content and performance elements.

➢ So we’re talking perturbations of a mesh, not brand new meshes.

If you wish to perturb new or existing meshes in real time, please have a look at the X3D HAnim mesh deformation tools.
First, the Displacer is a simple way to deform any mesh based on any event, rather than the more primitive and verbose CoordinateInterpolator approach.
Next, hook up a skeleton with some joints and skins, and do realistic skeleton-driven animation based on weighted rotations of related joints.
Combine the two, tune up your triggers, and you will see highest-grade mesh animation in real time.
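As a rough sketch of what a Displacer does under the hood: the field names below follow the HAnim Displacer node (coordIndex, displacements, weight), but the Python helper itself is hypothetical, just to show the arithmetic.

```python
def apply_displacer(coords, coord_index, displacements, weight):
    """Offset selected mesh vertices, as an HAnim Displacer does.

    coords        -- list of [x, y, z] base vertex positions
    coord_index   -- indices of the vertices this displacer affects
    displacements -- per-affected-vertex [dx, dy, dz] offsets
    weight        -- scalar driven by an animation event (0.0 = rest pose)
    """
    out = [list(p) for p in coords]  # copy; the base mesh stays untouched
    for i, d in zip(coord_index, displacements):
        out[i] = [c + weight * dc for c, dc in zip(out[i], d)]
    return out

# Driving weight from 0 to 1 morphs the mesh without a verbose
# per-vertex CoordinateInterpolator.
base = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
morphed = apply_displacer(base, [1], [[0.0, 0.5, 0.0]], weight=1.0)
# vertex 1 moves from [1, 0, 0] to [1, 0.5, 0]
```

A single event routed to weight then animates every affected vertex at once, which is what makes the Displacer so much terser than interpolating whole coordinate arrays.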

Sure, for now you are supposed to use the Humanoid container to get these features, but that is only because no one has implemented them outside of HAnim in basic X3D, using names that might be more familiar to some people, like deformers, morphers, or pivots. Also, knowing how these animation tools operate gives great hints about core details of how the stuff you want to see might actually be visualized and manipulated.

For the driver-assistance tools, a question essentially asked what resolution of video is used for vision synthesis. Of course the question was about using 144 or 240 or 360 or HD pixels, and it is a good question, because that might be a measure of feature resolution, or give some hint about the actual 'size' of the very proprietary neural network(s) in use. However, the system uses literally dozens of sensor inputs to compose each frame of the most trusted current representation of the active environment. So the science-and-marketing response was, in effect: good question, but notice that while the input resolution is certainly important, it is the output resolution that really counts.

Have fun with nD meshes,
Joe




From: John Carlson
Sent: Wednesday, May 22, 2019 7:19 PM
To: Joseph D Williams; X3D Graphics public mailing list; aono at tut.jp
Subject: RE: [x3d-public] Geometric/Geographic Deep Learning

As I muse about recent articles, I conclude that the neural network must mirror the data. That is, if you have an image, then the neural net takes all cells of the image at once. If you have a more complex graph, then again it takes the nodes and edges as input, not an image. You can convert a graph into an image, but as was said in the video, non-Euclidean geometries are pretty much a no-go for typical neural networks, which is why people have moved to GCNs and HCNs.

I think the point is, the structure of the neural network doesn’t change with each new type of geometry.  The neural network is fixed to one type of geometry.   So we’re talking perturbations of a mesh, not brand new meshes.

Thus, the brain is really good at doing vision, or doing hearing, but the vision cells will need to be retrained to hear something.

Synesthesia gets really interesting here.

John

Sent from Mail for Windows 10

From: Joseph D Williams
Sent: Wednesday, May 22, 2019 2:55 PM
To: John Carlson; X3D Graphics public mailing list
Subject: RE: [x3d-public] Geometric/Geographic Deep Learning

➢ What is the appropriate type of neural network for handling geometry?

In the simple example of recognizing a 2D shape, like a letter, the decision network appears as a set of connected vertices. For example, an element in the first level turns on only when it parses out a sequence of connected (real or virtual) basic elements that form, say, part of a curve; another element might turn on for another part of a shape. The next decision level is somehow trained so that one of its elements turns on when one or more of the elements above it turn on, maybe when enough of the first ones are seeing enough of an arc to make a circle. Downstream, another level is trained to see that a combination of a circle and other features makes a character. I see it as a form of cascade, where the state of multiple sets of inputs produces some known reaction(s) from the network.
That is why the Tesla video says a lot to me: as far as we know, they went ahead and produced the first computing hardware designed for this type of neural-network control decision making, at least for open commerce, at this scale, not as a kit.
Now when you get multiple inputs, even if they are only looking at the same thing with different viewpoints and lighting, then another level of decision making is learning how to put the various learnable elements together.
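That cascade of thresholded decision levels can be sketched in a few lines. This is a toy illustration, not a trained network: the weights are hand-set, and the stroke/part/character names are made up for the example.

```python
import numpy as np

def layer(inputs, weights, threshold=0.5):
    """One decision level: a unit 'turns on' (1.0) when the weighted
    sum of the units feeding it crosses a threshold."""
    return (weights @ inputs > threshold).astype(float)

# Toy cascade: 4 stroke detectors -> 2 part detectors (arc, bar)
# -> 1 character detector.
strokes = np.array([1.0, 1.0, 0.0, 1.0])     # which low-level strokes fired
w_parts = np.array([[0.5, 0.5, 0.0, 0.0],    # "arc" needs strokes 0 and 1
                    [0.0, 0.0, 0.5, 0.5]])   # "bar" needs strokes 2 and 3
w_char  = np.array([[0.3, 0.3]])             # character needs BOTH parts

parts = layer(strokes, w_parts)  # arc fires, bar does not
char  = layer(parts, w_char)     # not enough evidence for the character
```

Real networks replace the hard threshold with a smooth activation so the whole cascade can be trained by gradient descent, but the "units upstream turning units downstream on" picture is the same.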

So, for nD stuff, to me, the first step is getting whatever it is rendered so it can be analyzed and manipulated as the author intended. Then start looking for various nD artifacts that can semantically represent basic nD elements, which can generally and specifically represent basic elements of the universe you are interested in.
So in the hardware, it is a matter of having enough trainable connections to produce a competent result.
Hardware operations and external interfaces use some sort of programming that can be trained to learn how to operate the hardware and how to learn from system inputs and outputs.

➢ feed various geometries to a neural network

First you get the scene in front of you and the learning system in the form the author intended; then you train the trainer to recognize whatever it is you want to put into it or get out of it. That is why I still believe that if you have enough points and triangles you can produce a smooth or irregular shape, given whatever rendering concept. And I am confident that a network can easily be trained to learn about depth.
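The "enough points and triangles" idea can be illustrated with one round of midpoint subdivision, where each triangle becomes four and the mesh gains detail with every pass. This is a minimal hypothetical sketch, not any particular X3D tool.

```python
def subdivide(vertices, triangles):
    """One round of midpoint subdivision: each triangle becomes four,
    so point and triangle counts grow until the mesh looks smooth."""
    verts = list(vertices)
    cache = {}  # edge -> midpoint index, so shared edges share a vertex

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in cache:
            verts.append(tuple((verts[a][i] + verts[b][i]) / 2
                               for i in range(3)))
            cache[key] = len(verts) - 1
        return cache[key]

    tris = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        tris += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return verts, tris

v, t = subdivide([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])
# one triangle -> four triangles, three new midpoint vertices
```

Repeat the call a few times and the triangle count grows fourfold per pass, which is exactly why "enough points and triangles" can approximate any smooth or irregular shape.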

The Tesla autonomous-navigation video tried to emphasize that training can come from simulated data or from real data, and how much better the real stuff is than the simulated stuff. The need is to advance by accumulating data, learning enough about the data, and then deriving those recognizable representational aspects that provide whatever assistance you need. As expected, given the current state of the art and the final user, everything depends on the completeness of the data, the competence of the initial and ongoing trainers, and the ability of the system to learn how to effectively simulate itself using novel stimulations and responses while allowing continuous integration.

https://www.youtube.com/watch?v=IHZwWFHWa-w
Joe



From: John Carlson
Sent: Wednesday, May 22, 2019 9:50 AM
To: Joseph D Williams; X3D Graphics public mailing list
Subject: RE: [x3d-public] Geometric/Geographic Deep Learning

We are speaking of the difference between a HyperMovie and a HyperShape, at a fundamental level.  One is appearance, the other is geometry.  What is the appropriate type of neural network for handling geometry?

Yes, I realize it’s pixels when you look at it. However, a shape is not always rectangular or flat.   This is applying neural networks to Non-Euclidean data (or so they say).

Seems like a very complex problem how to feed various geometries to a neural network. I would tend toward rectangularizing it.

So similar to how a CNN traverses through an image finding features, a GCN or HCN traverses through a graph or hypergraph finding features.   That’s the main difference.  The similarity is that they’re all convolutional.
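The CNN/GCN analogy can be sketched with a single graph-convolution layer in one common formulation (the Kipf and Welling style, H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)); the tiny graph and identity weights below are made up purely for illustration.

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One graph-convolution layer. Like a CNN kernel sliding over a
    pixel neighborhood, each node mixes in its neighbors' features --
    but here the 'neighborhood' comes from the graph's edges."""
    a_hat = adjacency + np.eye(adjacency.shape[0])      # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    norm = d_inv_sqrt @ a_hat @ d_inv_sqrt              # symmetric normalization
    return np.maximum(0.0, norm @ features @ weights)   # ReLU activation

# Tiny 3-node path graph 0-1-2, two features per node
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
W = np.eye(2)  # identity weights keep the example readable

out = gcn_layer(A, H, W)  # shape (3, 2): each node now reflects its neighbors
```

Stacking such layers lets features propagate across multi-hop neighborhoods, just as stacked CNN layers grow the receptive field over pixels.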

I agree that images can be graphs and graphs can be images.

Hmm.

John


From: John Carlson
Sent: Wednesday, May 22, 2019 11:02 AM
To: Joseph D Williams; X3D Graphics public mailing list
Subject: RE: [x3d-public] Geometric/Geographic Deep Learning

So instead of the output from neural networks being voxels, maybe, just maybe, we can have graphs and meshes? I'm not entirely clear on the distinction between the data and the network. I guess a GCN can take graphs as input?

https://arxiv.org/pdf/1903.10384.pdf


John


From: John Carlson
Sent: Wednesday, May 22, 2019 10:44 AM
To: Joseph D Williams; X3D Graphics public mailing list
Subject: RE: [x3d-public] Geometric/Geographic Deep Learning

No, you don’t get it.  It’s not even a picture/image/frame.  It’s a graph/mesh.  Not a CNN.   A GCN or HCN.  No one said anything about moving or frames except you.

In other words, we’ve gotten past pixels in our thinking.

John


From: Joseph D Williams
Sent: Wednesday, May 22, 2019 10:36 AM
To: John Carlson; X3D Graphics public mailing list
Subject: RE: [x3d-public] Geometric/Geographic Deep Learning

https://www.youtube.com/watch?v=aircAruvnKk

When things are moving, we can start to think of frames. If there is no movement, only one frame is needed.
Well, we start with the idea of a network, then think about how to invent a computing structure to compute the stuff.
The hardware and the training seem to be very important. That phrase "continuous integration" holds the idea of a dynamically changing output result.
Joe



From: John Carlson
Sent: Sunday, May 19, 2019 12:18 PM
To: Joseph D Williams; X3D Graphics public mailing list
Subject: RE: [x3d-public] Geometric/Geographic Deep Learning

Uh, I just wanted to do geometric and geographic deep learning?

“Frame”?  https://www.youtube.com/watch?v=D3fnGG7cdjY
John


From: Joseph D Williams
Sent: Sunday, May 19, 2019 11:58 AM
To: John Carlson; X3D Graphics public mailing list
Subject: RE: [x3d-public] Geometric/Geographic Deep Learning


If you wish to discuss anything involving anticipation, simulation, recognition, labeling, intentionality, inclusion, exclusion, semantic and physical relationships, what the computer wants to see, deep learning, or continuous integration, then watch some of this.

https://www.youtube.com/watch?v=-b041NXGPZ8

To convolve and deconvolve is basic. How many frames do you want? How many neurons have you got?
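How basic convolution is can be shown in one dimension; the signal and kernel below are invented for the example.

```python
import numpy as np

# Convolution slides a small kernel over a signal; here a 3-tap
# smoothing kernel over a hard step edge.
signal = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
kernel = np.array([0.25, 0.5, 0.25])

smoothed = np.convolve(signal, kernel, mode="same")
# The hard 0 -> 1 step becomes a ramp: each output sample "sees" a
# small neighborhood of inputs -- the same trick a CNN layer plays
# in 2D over pixels, except a CNN learns its kernels.
```

Deconvolution runs the estimate the other way, trying to recover the sharp signal from the smoothed one, and that is where the frames-versus-neurons trade-off starts to bite.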

Thanks, 
Joe


From: John Carlson
Sent: Sunday, May 19, 2019 8:34 AM
To: X3D Graphics public mailing list
Subject: [x3d-public] Geometric/Geographic Deep Learning

Finally, something that interests me about deep learning! Is anyone working on geometric or geographic deep learning? It appears that these subfields of deep learning have emerged, based on Graph Convolutional Networks (GCNs), and perhaps HyperGCNs.

Thanks,

John









