Blog: The AR Paradigm Shift, Machine Learning, and Smart Objects for Your Face
Google made a valiant push for its Smart Glasses (shown above) starting in 2013 and “went back to the lab” by 2015 due to a lack of market traction, among other issues. As we all know, it’s rarely the first incarnation of a product that goes mainstream. The iPod was preceded by countless now-defunct MP3 players. Before smartphone navigation, standalone handheld GPS devices were common; the smartphone eliminated the need for a separate, clunky device. Thus, we cannot expect the first version of a product to have a high likelihood of success.
Augmented Reality (AR) has come a long way since 2015. The most obvious platform for widespread AR success is the mobile phone: most people are more likely to forget their keys or wallet than their smartphone. Again, what device has replaced most other devices and has the widest variety of use cases? The smartphone. In a world flush with developers, Google and Apple are taking full advantage of the mobile AR scene with their respective developer environments, ARCore and ARKit.
Glasses are the other obvious platform for AR success. As a data geek, I love the idea of maintaining my nerdy look while having environment statistics, historical facts, and augmented virtual toys projected to my retinas. Following in the footsteps of Google Glass, Microsoft is investing heavily in its second-generation AR glasses, the HoloLens 2, aimed mostly at design simulation in construction, medical, and other “hands-on” industries. There are a multitude of other players, but the most anticipated is Apple, which is targeting consumers (as opposed to industries). Aesthetics and functionality will be the arbiters of Apple’s success. After all, does this artistic rendering (from Apple’s patent application) convey a sense of “cool”?
All of the compute power will come from the iPhone, which will hopefully keep the glasses slim but will alter the structure of typical machine learning algorithms. Analysts project that Apple will commence mass production of its AR glasses by the second quarter of 2020, possibly as early as Q4 2019.
Machine Learning and AR
For the case where the bulk of compute power comes from the user’s smartphone, machine learning algorithms and model training periods will need to be compact. For AR devices that learn from a user’s surroundings, Convolutional Neural Networks (CNNs) help determine which objects and experiences are adapted to the user. Given that most current image classification algorithms are trained on GPUs, the wide adoption of AR will prompt slimmer, “quick-win” algorithms that require less compute power. This impetus will also increase demand for beefier smartphone processing chips, which should further decouple everyday computing from traditional desktop interaction and push it toward the mobile experience. Companies like Xnor.ai are building lighter-weight AI algorithms for “edge devices”. In this paper, Xnor.ai proposes two methodologies for more efficient CNNs.
Binary-Weight-Networks are exactly what they sound like — all the weight values are approximated with binary values (instead of costly real values). Next, they propose “XNOR-Networks where both the weights and the inputs to the convolutional and fully connected layers are approximated with binary values.” Since “all of the operands of the convolutions are binary, then the convolutions can be estimated by XNOR and bitcounting operations.” As shown in the image above, the former approach yields ~2x computation savings, while the latter yields ~58x computation savings! These deep neural networks are built to run on less compute power and, more specifically, to run on CPUs (think mobile).
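To make the two ideas concrete, here is a minimal NumPy sketch of both tricks as described in the XNOR-Net paper: approximating real-valued weights as a scaling factor times a sign matrix, and replacing a {-1, +1} dot product with an XNOR plus a bit count. The function names and bit-packing scheme are my own illustration, not taken from Xnor.ai’s code.

```python
import numpy as np

def binarize_weights(W):
    """Binary-Weight-Network idea: approximate real-valued W as alpha * B,
    where B = sign(W) and alpha is the mean absolute weight value."""
    alpha = np.abs(W).mean()
    B = np.where(W >= 0, 1.0, -1.0)
    return alpha, B

def pack(v):
    """Encode a {-1, +1} vector as an integer: +1 -> bit 1, -1 -> bit 0."""
    return int("".join("1" if s > 0 else "0" for s in v), 2)

def xnor_dot(x_bits, w_bits, n):
    """XNOR-Network idea: with inputs and weights both binary, a dot
    product reduces to XNOR followed by a bit count:
    dot = 2 * popcount(xnor(x, w)) - n."""
    xnor = ~(x_bits ^ w_bits) & ((1 << n) - 1)  # bitwise XNOR, masked to n bits
    return 2 * bin(xnor).count("1") - n

# Sanity-check the XNOR trick against an ordinary dot product.
rng = np.random.default_rng(0)
n = 16
x = rng.choice([-1, 1], size=n)
w = rng.choice([-1, 1], size=n)
assert xnor_dot(pack(x), pack(w), n) == int(x @ w)
```

The payoff is that the XNOR and bit count operate on whole machine words at once, which is why binarizing both operands buys so much more than binarizing weights alone.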
Another lane of machine learning is destined to greatly influence augmented reality applications: Swim recently adopted an Apache license for its open-source Edge Intelligence Platform.
This platform is revolutionizing data streaming by using machine learning models to create more efficient streaming applications (it’s like ML-ception). The company calls its ML layer a Digital Twin. Swim was tasked with collecting traffic data in the city of Palo Alto. Across the city, “the intersections collaborate to predict the future flow of traffic about two to five minutes ahead of time.” You can demo the UI visualization here. The application build is ~2MB and can be installed on IoT sensors and edge devices. Because Swim is open source, expect this project to entice developers to build further machine learning improvements for AR.
The Next Shift
There will be many more AR advances around smart sensors, such as brain-computer interfaces (BCIs) or, more generally, human-machine interfaces (HMIs). A great example of HMI for design is Asteroid, founded by Saku Panditharatne: a macOS design tool that helps developers build apps using AR, ML, voice, gesture, and eye-tracking. The company’s bundle will include multiple sensors and a brain-computer interface. “The process of putting your thoughts into a computer may be 10x or even 100x faster than with just a mouse.”
In Saku’s blog, sensor types are outlined with corresponding machine learning algorithm classes and display types:
What’s next? BCI sensors could register amygdaloid responses to stressful thoughts or situations, prompting your glasses or device to attempt to remedy the stress response. These sensors, combined with machine learning, would build out evolving behavioral recommendations and could be the backbone for near-real-time cognitive behavioral therapy (CBT). How many patient-doctor dollars and hours could be forgone via AR CBT (#ARCBT #trending)? I’m not arguing for a dismantling of the psychiatry profession, but rather a re-channeling toward more efficient, technology-enabled solutions. I digress.
Keep an eye out for widespread reality augmentation in the near future. Apple could change the AR consumer game, but there are plenty of other opportunities for advances around different types of human-machine interfaces in relation to augmented and mixed reality!