Thursday, January 23, 2025

AI agents could finally make Siri and Alexa truly useful

When newly minted Google CEO Sundar Pichai introduced the Google Assistant in 2016 as part of his new “AI-first” agenda, he touted the fledgling voice assistant as a tool designed to help people complete tasks.

“Google Assistant empowers you to get things done by giving you the information you need, whenever you need it, wherever you are,” he wrote in a blog post at the time.

It was a lofty goal that was largely missed. All too often, the software stumbles when faced with a question, defaults to an internet search or apologizes that it cannot help. As a result, people have limited voice assistants to simple tasks like setting cooking timers, playing music or controlling their lights. Amazon’s Alexa, which launched a decade ago, hasn’t fared much better. Siri, the oldest of the three, introduced by Apple in 2011, has probably been panned the most.

But as generative AI has gone mainstream over the past two years, it has paved the way for AI “agents”: software specifically programmed to take actions or complete tasks on a user’s behalf, such as making a reservation or buying something online. And as the “agent era,” as Pichai calls it, begins in 2025, the technology has a chance to do something that has so far eluded the big tech platforms: make their voice assistants actually useful.

This means that Google Assistant, Alexa and Siri could finally fulfill their promise of acting like personal assistants. Instead of just reciting your meeting schedule for the day, as Google Assistant can now, an assistant might be able to book meetings, reach out to contacts and find a time that works for both people. You might be able to book flights and hotels for a big vacation, as you would through a travel agent, with little more information than travel dates and a destination.

According to a Forrester study, agents are the latest trend in the technology industry. More than 470 platforms are dedicated to the technology, ranging from large tech giants to smaller startups like LangChain, CrewAI and Play.ai. Beyond consumer uses, agents could also transform businesses, which could deploy them for customer support or software development. According to PitchBook, the number of deals for AI agent startups has increased by more than 81% in the last year, with more than $8 billion invested in the space.

“The race is on,” said Steve Jang, a Forbes Midas List investor and founder of Kindred Ventures. “Startups will be competing with the established platforms to see who can orchestrate this with much higher fidelity. And who can create much more humanistic and realistic voices and conversations and access the data and actions we all want?”

The major voice assistants are well positioned for such an AI boost. Google has its flagship model, Gemini, to enhance its voice search. Apple announced a partnership with OpenAI earlier this year to use ChatGPT to support some Siri requests. And last year, Amazon invested $8 billion in Anthropic, which makes the powerful Claude chatbot. Google declined to make any of its executives available for interviews. Apple and Amazon did not respond to interview requests.

Jang believes the real innovation will come from native voice AI models. Unlike the large language models that underlie services like ChatGPT, voice models are not trained on text that is then read aloud by software. Instead, they are trained on actual speech audio, allowing them to recognize subtleties of speech such as cadence or emotional cues. Jang has invested in Play.ai, which specializes in voice agents; it competes with companies like ElevenLabs, OpenAI and Google, all of which are working on voice models.

Some, however, are not convinced that agents will dramatically improve the major voice assistants. Kanjun Qiu, co-founder of Imbue, a company that develops agents for coding software, believes adding more AI to these products will only improve them “incrementally.” She said the new AI capabilities still aren’t good enough to trust. “Delegating as a paradigm is actually very difficult for people,” Qiu said. “I only use Siri for trivial things that I know won’t screw things up.”

Still, she believes recent improvements in voice AI will help consumers in other ways. For example, more apps will integrate voice capabilities, she predicts. With lower latency and better natural language understanding, you’ll be able to give instructions to an app and it will carry out that action, Qiu said, such as telling an e-commerce app that you want to return a pair of shoes that doesn’t quite fit. (An engineer by training, she said she built an app for herself that turns her chatter into a to-do list.)

Improvements in AI and voice technology could also enable hardware ambitions that Silicon Valley has pursued for years. More than a decade ago, Google committed an infamous faux pas with the introduction of Google Glass, a pair of smart glasses that raised privacy concerns and weren’t very useful. Earlier this month, the company unveiled a new prototype of glasses for use with Project Astra, Google’s new platform for AI agents. In a demo, the voice-controlled glasses automatically retrieved a door code from the wearer’s email as soon as he looked at the keypad. The wearer could also pull up route information for the bus in front of him or details about a sculpture he passed.

Meanwhile, Facebook’s Orion glasses, announced earlier this year, use a combination of voice and hand gestures to control AI tools, such as looking at ingredients in the pantry and asking the assistant to find a recipe that uses them.

Voice-based innovations also make technology more accessible. Not everyone can read, write or type, but more people are able to speak, Jang said. And voice is a growing preference among young people: According to a study by YouGov and Vox, 42% of 18- to 29-year-olds in the US send voice messages on their chat apps at least weekly.

New advances in AI could increase the adoption of voice tools even further and change the way people interact with their technology. “It turns voice agents, and voice itself, into this great new user interface that hasn’t been used before in the computing industry,” Jang said.
