Karan Kamble
Jan 31, 2024, 11:25 AM | Updated 11:25 AM IST
We’ve grown tired of smartphones, declared Jesse Lyu, the founder and chief executive of Rabbit, during the launch of his company’s remarkable artificial intelligence (AI)-powered talking device called the r1. (Read our story about it.)
While most consumers may not immediately relate to the idea that we must go beyond the smartphone, given that it’s all we know presently and swear by day in and day out, the power of the alternative is undeniable, especially since the source of that power has taken the internet world by storm over the last year.
This ‘source’ is generative AI and specifically the large language model (LLM), which enables the viral AI chatbot sensation ChatGPT and others like it to answer questions and have a free-flowing conversation as if it were an actual human assistant.
It’s human-machine interaction at a level that we have never seen before on a wide scale — and it’s unfolding at a time when smartphone sales are falling and AI is the top draw for venture capital.
This novel human-machine interaction enabled by generative AI is presently limited to particular websites or mobile applications (apps), which in turn are chained to the smartphone. Imagine the pervasive power of a ChatGPT if it were unshackled from the laptop or the smartphone and channelled through, say, an exclusive AI-powered device operating across text, voice, and vision.
That is roughly the attempt of some smartphone-wary technology companies. The Rabbit r1, for instance, represents a new generation of devices, quite different from the all-too-familiar smartphone, in that you push a button on the side of the device to ask it questions in search of answers. It’s an AI walkie-talkie.
However, the r1 is not simply a talking ChatGPT powered by an LLM. Rabbit’s highlight is the large action model (LAM), which, after extensive training on all sorts of user interfaces, is able to accomplish tasks across apps and services.
While an LLM speaks with you and answers your queries, an LAM gets things done for you. (The r1 uses an LLM, too, that of San Francisco-based startup Perplexity.)
The promise of the Rabbit OS operating system is that it hears a human voice input in natural language, makes sense of what is really being asked, and translates it into actionable steps or responses in accordance with the interaction.
So, it can execute a relatively complex combination of tasks as advertised by the company: “Order me an Uber and find me a good podcast to pass the time… oh, and tell everybody that I might be late.”
Such an “instruction” on the smartphone would translate to three separate tasks, each comprising multiple subtasks: opening the Uber app and going through various steps to book a ride, then running a web search for good podcasts or visually scanning one or more podcast apps or web services, and finally opening a messaging service to send individual messages to specific people telling them you’ll be late.
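The breakdown described above can be sketched as a small data structure. This is only an illustration of how one spoken instruction fans out into app-level tasks and manual steps; the `Task` class, the function names, and the step lists are hypothetical and hand-written, and nothing here reflects how Rabbit's LAM actually works.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One app-level task and the manual steps it takes on a smartphone."""
    app: str
    goal: str
    steps: list[str] = field(default_factory=list)


def decompose_example() -> list[Task]:
    """Hand-written breakdown of the sample instruction (illustrative only)."""
    return [
        Task("ride-hailing app", "order an Uber",
             ["open app", "set destination", "pick ride type", "confirm booking"]),
        Task("podcast app", "find a good podcast",
             ["open app", "search or browse", "pick an episode"]),
        Task("messaging app", "tell everybody I might be late",
             ["open app", "choose recipients", "type message", "send"]),
    ]


def manual_step_count(tasks: list[Task]) -> int:
    """Total number of manual steps across all tasks."""
    return sum(len(task.steps) for task in tasks)


if __name__ == "__main__":
    tasks = decompose_example()
    for task in tasks:
        print(f"{task.goal}: {len(task.steps)} steps in the {task.app}")
    print("total manual steps:", manual_step_count(tasks))
```

The point of the sketch is the contrast: a single sentence from the user stands in for every entry in these step lists, which on a smartphone would each be a tap, a swipe, or some typing.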
Compare the number of taps and swipes on the screen necessary to accomplish such a mundane operation on the smartphone with simply telling your device what to do and for it to follow through as instructed. It’s pretty evident, therefore, that the r1 represents a new way of doing things — and a direct threat to the smartphone over time.
According to Lyu, who passionately advocates transcending the current smartphone paradigm of the app-based operating system model, the r1 is not a smartphone replacement; rather, it is meant to be a companion device to the smartphone.
However, the r1 needn’t remain an add-on. If the $199-priced r1 gains wide acceptance, as early sales suggest it might, and if smartphone “advances” continue to be as paltry as the addition of yet another camera and almost imperceptibly higher processing speeds, Rabbit could easily develop and offer premium versions that make the smartphone redundant.
But the movement away from the smartphone isn’t centred on the r1 alone. Rabbit’s AI gadget is in the limelight because of its launch in early January 2024; however, late last year another company unveiled a device that departs from the smartphone far more radically than the r1 does.
In November 2023, Humane unveiled its “Ai Pin” — a standalone, wearable device and software platform powered by AI. It is meant to be worn on the shirt or jacket, around the chest, so it can follow you wherever you go. It constantly learns about you and your surroundings depending on your interactions with it.
One engages with this AI device through touch, voice, gesture, or a laser ink display that stands in for a screen. It is equipped with an ultra-wide RGB camera, depth sensor, motion sensors, and a speaker.
Using touch controls and voice, one can use the Ai Pin to ask questions, make a phone call, send and search through messages, play music, take pictures and videos, interpret different languages in voice, and so much more. Doing away with the screen, the Ai Pin instead projects a display on to your palm when you bring it in line with the device. Subsequent hand gestures help control the device, such as the playback or volume controls for the music being played.
Notably, using computer vision, the Ai Pin can recognise visual objects. This feature is helpful in, for instance, supporting someone’s health and nutrition goals, with the AI capable of telling you, when prompted, the nutritional value of the food that it sees. It can help make online purchases by voice, eliminating the need to navigate a shopping app through numerous taps and swipes on the smartphone screen. All the user data generated, including photos, videos, and notes, is controlled via a central hub called “Humane.Center.”
Like the r1, the Ai Pin moves away from app-based operation. “We don’t do apps. Humane’s OS runs AI experiences that are on device and in the cloud. The OS understands what you need and picks the right AI in the moment,” Humane co-founder Bethany Bongiorno said in the video introducing the breakthrough device. Humane’s “operating system for the AI era” is called Cosmos.
The entire Ai Pin system is priced at $699. With a monthly fee of $24 for the data and voice carrier plan, the user can tap into Humane’s growing network of services.
“It is our aim at Humane to build for the world not as it exists today but as it could be tomorrow — one where we can take the full power of AI everywhere and have it weave seamlessly into our everyday lives,” said co-founder Imran Chaudhri in the Ai Pin introduction video.
Gadgets like the Humane Ai Pin and the Rabbit r1 certainly point to the future, but they aren’t without their faults. For instance, must one always talk aloud to operate them? In many instances, talking aloud in a public space isn’t convenient, safe, or even possible.
And until these devices can do it all by themselves, they will have to serve as the second brick in the pocket in addition to the smartphone — would people be inclined to actively carry and operate two devices?
But maybe these devices will learn how to get better, just as the AI powering them does.
There is also the possibility that the smartphones of tomorrow will eat up these AI devices simply by powerfully integrating AI within their existing frameworks. But the smartphone will remain a highly distracting device. The way it works through the app ecosystem is that it needs you to be looking down on the screen and engaging with it to get anything done.
On the contrary, there is now a growing desire for devices that can do what the smartphone does but without the huge daily investment in time and attention that it demands. Perhaps the way out of this conundrum would be to turn the entire world into a screen, by the use of smart glasses and headsets, and thereby merge the real with the reel.
The smartphone may, therefore, see itself losing out to these younger cousins that promise to give people back their time by freeing up their hands and eyes and… letting them be more human?
That is, until advanced brain-computer interfaces, like the one Elon Musk’s Neuralink is now developing to enable computer control through thought, make us superhuman.
Karan Kamble writes on science and technology. He occasionally wears the hat of a video anchor for Swarajya's online video programmes.