After their baby fell asleep, Ashwini Asokan and Anand Chandrasekaran talked. Plonked on a sofa, they discussed artificial intelligence, endlessly, late into the night. The couple are fascinated by the science of machines that see – computer vision, in other words. Could this sophisticated technology be brought to smartphones, even low-end ones, and used to create a compelling experience for the mobile user? This was the dream that kept them awake through the nights in the heart of Silicon Valley.
Asokan was leading the mobile innovation team at Intel’s Interaction and Experience Research Lab (IXR). Her team of designers, anthropologists, and engineers was exploring the future of mobile technology. As for Chandrasekaran, he had just completed his post-doctoral studies in neuromorphic engineering – building computer chips that imitate the human brain – at Stanford University, and was working on neuroscience-based projects as a consultant. During his three years at Stanford, he was in the thick of artificial intelligence. He was part of the team that built Neurogrid, a system that mimics a million neurons in real time. He also designed and simulated a chip that could mirror the plasticity of the brain – its ability to change with experience.
The couple had enough grey matter to work on their dream. It would take them a year or two to build the technology, they estimated.
At the time, they were ideally located in Silicon Valley, a hotbed for tech startups. But it’s expensive. There were a baby and a toddler at home as well. So, after much debate, the couple decided to pack their bags, return to India, and start up in their hometown, Chennai, where they had a family support system and their savings would give them a longer lead time.
Less than a year later, the couple’s startup, Mad Street Den, is up and running with a cloud-based platform that uses artificial intelligence to enable any smartphone with a camera to identify faces, detect facial expressions and emotions, and react to facial and head gestures. The expression acts as a trigger for a certain action – for example, frowning at the phone when an unwanted call comes in could make it shut down forthwith, or lifting an eyebrow could send the caller a message asking ‘What now?’
This image-recognizing platform, called MAD Stack, can be used by app developers and companies to create a futuristic mobile user experience. “The idea is to make machines more useful by making them a bit human: fun, intelligent, and relevant. We use computer vision to do that,” Asokan tells Tech in Asia.
The process of recognizing a human expression or responding to a gesture is simple for a human brain, but quite complex for a smartphone camera to do digitally. It is artificial intelligence that enables a camera to do this. What’s more, the app keeps getting smarter with use, through machine learning algorithms.
When machines can read your mind
Computer vision is a much-maligned technology. Science fiction, on paper and on screen, has made sure that it’s as much feared as it’s glorified, from a nebulous Big Brother snooping on citizens to freaky Transformers that are aliens disguised as machines. This irks Asokan. “This technology can have countless fun uses. Why should lay people hear of it in a negative light mostly? Anand and I want to change that,” she says. “Our goal is to move the conversation around computer vision away from surveillance, security, and all the scary stuff to make it fun.”
She cites an array of applications for this technology, some of which are already out there: Amazon’s new Fire Phone can recognize barcodes, box art, or even TV audio, and help you find the object of your desire. If a phone can recognize a barcode, why not a child’s facial expression? And how might that be used in a fun game?
As for businesses, such a capability could take customer analytics to a new level. Amazon’s Firefly, billed as “visual search on steroids”, is only the beginning. Mad Street Den promises to bring similar visual search technology to any smartphone. Its object recognition feature could be a neat trick for Indian ecommerce companies like Flipkart, Snapdeal, and others competing with Amazon.
The Mad Street Den team has built its platform keeping these applications in mind.
From gaming to weeding
Technologists around the world are working on computer vision, and innovations like Amazon’s Fire Phone, Facebook’s Oculus Rift, and its Chinese rival ANTVR, the “all-in-one universal virtual reality kit”, are hogging headlines. There’s even a startup, Blue River, which uses computer vision to identify weeds in organic crop fields and then selectively eliminate them. These involve expensive hardware and, for now, are limited to the few who can afford them.
In contrast, Mad Street Den’s technology is more mass market as it is software-focused and in the cloud. All it needs is to be coupled with a smartphone.
Besides potential uses in the ecommerce industry, MAD Stack can be used to build more immersive games and better social media experiences, and to power mobile analytics.
Take the popular game for kids, Talking Tom, where children can say something to a cat named Tom and he will repeat it in a silly voice. Mad Street Den can add another dimension to this game by allowing the app to see the kid’s expressions. The cat could respond by making faces of its own. “Kids play with this game endlessly,” says Asokan, whose daughter stars in a demo on the Mad Street Den website, where she makes all sorts of faces into a camera, which recognizes them and responds with emoticons that mimic her expressions.
MAD Stack can similarly make learning fun too. Asokan and Chandrasekaran are in talks with a developer of educational material for children, who sees many possibilities.
An eye on the competition
There is competition in the computer vision applications space already. The Eye Tribe, a Copenhagen-based startup, has proprietary software that enables eye control of mobile devices and computers. It allows hands-free navigation of websites and apps, eye-activated login, enhanced gaming experiences, and cloud-based user engagement analytics, the company claims. But this software does need hardware components to run. Its plan is to partner with hardware makers who want to integrate these capabilities. The Samsung Galaxy S4 has an eye-tracking feature built into it, which utilizes the front-facing camera to follow a user’s eye movements. It can pause a video if you look away, resume it when you look back, scroll up and down on websites and email, and keep the phone’s display from going to sleep based on where your eyes are trained on the screen.
New York-based IMSRV is another player in this space. It has developed technology to measure human emotions using any webcam. This can be used by companies to analyze facial expressions, use this information to hone their operations, and tailor digital advertisements.
Coming back to the Mad Street Den couple, here’s the vision they had before they got to computer vision applications: Chandrasekaran used to work on the hardware side of artificial intelligence, designing brain-like chips in the Valley. But it was the worldview on artificial intelligence, or the softer side of it if you like, which preoccupied the couple during those long discussions through the night. What bothered Asokan was the slanted narrative of killer robots, cops in flying cars, and machines taking over the world. None of these predictions made decades ago in science fiction are anywhere near coming true. What about the other side of artificial intelligence, the useful side, the great applications it could have in everyday life?
Being so close to the technology, Chandrasekaran and Asokan knew what was possible. More than that, being new parents made them acutely aware of the false narrative on technology that kids are constantly bombarded with in comic books, movies, and computer games. Asokan especially wanted to change that. “I used to tell Anand, ‘It’s great that you are working on hardware. But that’s not going to change the narrative to a more positive, more real, and fun one. Beyond someone snooping on you, beyond someone scaring you, beyond someone killing you, right?’”
Mad Street Den’s computer vision platform, built to put artificial intelligence to use in everyday life and bring joy to ordinary people, is the first step in that direction. It now has an SDK for other developers to plug into. So we can look forward to a future with all sorts of delightful mind-abled devices – which is what ‘mad’ stands for.
Editing by Steven Millward