Blog: How Might a Human Machine Think?
In a Kaggle competition on distracted driving, the dataset consisted of images of drivers who either were or weren’t engaging in distracting activities. Rachel Thomas warns against “…overfitting to particularities of those specific people, and not just learning the states (texting, eating, etc.)” (emphasis added).
The states. This means that, in effect, the ML is pulling out the relevant concepts, just as we do. In other words, if you give it a dataset showing a multiplicity of people texting, talking, eating, etc. while driving, the ML will filter out the irrelevant differences and commonalities in the images and focus only on the states you’re interested in.
It’s also very much what we humans do. It doesn’t matter if the neural network is doing exactly what we’re doing, or doing it just the way we do it. What matters is that it ends up arriving at a result that we agree with: that kid is texting; that bus driver is eating a sandwich; that fine upstanding citizen has both hands on the steering wheel and is paying attention.
This is quite stunning. The network is able to extract specific concepts from the images, to ‘recognize’ distractions from non-distractions. It ignores the fact that all these images may have people in them, that they all may contain steering wheels, etc. That isn’t what we’re pointing at. We’re pointing at the states, and the computer gets that!
That is remarkable, actually incredible. The whole point of neural networks and deep learning is that we aren’t programming the computer. We’re just putting in a bunch of inputs and letting it run, using gradient descent, loss functions, etc. as a way to ‘tune’ what it’s doing. This means that it is generalizing: it’s extracting and recognizing an idea based on multiple instances. And this process of generalization doesn’t stop within a ‘local’ (my word) domain. It is now starting to spread: via transfer learning, it applies in one domain what it has learned in another.
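To make that ‘tuning’ concrete, here is a minimal sketch of gradient descent on a squared-error loss. The data, the learning rate, and the one-parameter “model” are all made up for illustration; a real network just does this with millions of parameters instead of one.

```python
# Toy illustration of 'tuning' by gradient descent: fit y = w*x to a few
# points by repeatedly nudging w in the direction that reduces the loss.
import numpy as np

xs = np.array([1.0, 2.0, 3.0, 4.0])
ys = 2.0 * xs   # the pattern we want the model to generalize: y = 2x

w = 0.0         # start knowing nothing
lr = 0.02       # learning rate (made up for this sketch)

for _ in range(200):
    preds = w * xs
    loss = np.mean((preds - ys) ** 2)          # how wrong we are now
    grad = np.mean(2 * (preds - ys) * xs)      # d(loss)/dw
    w -= lr * grad                             # nudge w downhill

print(round(w, 3))
```

Nobody tells the program that the answer is 2; it arrives there by repeatedly measuring its own error and adjusting, which is the sense in which we aren’t programming it.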
Recognition of a concept requires that the instances of that concept are repeated.
Consider the following:
I’m so happy
I’m full of happiness
I’m full of joy
I’m overflowing with happiness
I’m so blessed
I’m so fortunate
I feel so good
I’m counting my blessings
I have a wonderful life
These are all, to varying degrees, expressions of the same happy idea. Happy is a good thing; it’s a positive thing.
Notice that a couple of them have the idea of fullness, which generalizes from the physical world:
I’m full of happiness
I’m full of joy
The gas tank is full
The swimming pool is full
The glass is full
My schedule is full
What we’re doing here is looking at cross-domain meaning. And that, after all, is what meaning is: meaning is that which generalizes from one instance to another. Just as the machine pulls out states, we pull out states, and we do it effortlessly. In training a neural net to recognize concepts, then, we need to go cross-domain. In coming posts I’ll try to lay out some ideas for how to do that.
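One way systems approximate cross-domain meaning is with word vectors, where related uses of a word end up pointing in similar directions. The sketch below uses hand-made vectors as stand-ins (real embeddings like word2vec or GloVe are learned from text, not written by hand); the three “dimensions” are invented purely to show the mechanics of comparison.

```python
# Toy sketch of cross-domain meaning via vector similarity.
# The vectors are hand-made illustrations, NOT learned embeddings.
import numpy as np

def cosine(a, b):
    """Cosine similarity: near 1.0 = similar direction, near 0.0 = unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pretend dimensions: [capacity/fullness, positive emotion, physical liquid]
vectors = {
    "full":      np.array([0.9, 0.1, 0.2]),
    "happiness": np.array([0.3, 0.9, 0.0]),
    "joy":       np.array([0.2, 0.95, 0.0]),
    "gas_tank":  np.array([0.8, 0.0, 0.6]),
}

joy_vs_happiness = cosine(vectors["joy"], vectors["happiness"])
full_vs_gas_tank = cosine(vectors["full"], vectors["gas_tank"])
```

Here “full” scores high against both the emotional words and the physical “gas_tank”: it sits between the two domains, which is exactly the kind of generalization the fullness examples above are pointing at.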