How Amazon blew Alexa’s shot to dominate AI, according to employees who worked on it
With the phrase “Alexa, let’s chat,” David Limp, at the time Amazon’s head of devices and services, showed off a new generative AI-powered version of the company’s signature Alexa voice assistant in September 2023.
At a packed event at the Seattle-based tech giant’s lavish second headquarters in the Washington DC suburbs, Limp demonstrated the new Alexa for a room full of reporters and cheering employees. He showed how, in response to the new trigger phrase, “Alexa, let’s chat,” the digital assistant responded in a far more natural and conversational voice than the friendly-but-robotic one that hundreds of millions have become accustomed to talking with for weather updates, reminders, timers and music requests. Limp asked Alexa how his favorite football team, Vanderbilt University, was doing. Alexa showed how it could respond in a joyful voice, and how it could write a message to his friends to remind them to watch the upcoming Vanderbilt football game and send it to his phone.
The new Alexa LLM, the company said, would soon be available as a free preview on Alexa-powered devices in the US. Rohit Prasad, Amazon’s SVP and Alexa chief, said the news marked a “massive transformation of the assistant we love,” and called the new Alexa a “super agent.” It was clear the company wanted to refute perceptions that the existing Alexa lacked smarts. (Microsoft CEO Satya Nadella reportedly called it “dumb as a rock” in March 2023 as OpenAI’s ChatGPT rocketed to fame.)
But after the event, there was radio silence, or digital assistant silence, as the case may be. The traditional Alexa voice never changed on the half-a-billion devices that have been sold globally, and little news emerged over the coming months about the new generative AI Alexa, other than recent reports about a potential launch later this year that could include a subscription charge.
The reason, according to interviews with more than a dozen former employees who worked on AI for Alexa, is an organization beset by structural dysfunction and technological challenges that have repeatedly delayed shipment of the new generative AI-powered Alexa. Overall, the former employees paint a picture of a company desperately behind its Big Tech rivals Google, Microsoft, and Meta in the race to launch AI chatbots and agents, and floundering in its efforts to catch up.
The September 2023 demo, the former employees emphasize, was just that: a demo. The new Alexa was not ready for a prime-time rollout, and still isn’t. The Alexa large language model (LLM) that sits at the heart of the new Alexa, and which Amazon positioned as taking on OpenAI’s ChatGPT, is, according to former employees, far from state-of-the-art. Research scientists who worked on the LLM said Amazon does not have enough data or access to the specialized computer chips needed to run LLMs to compete with rival efforts at companies like OpenAI. Amazon has also, former employees say, repeatedly deprioritized the new Alexa in favor of building generative AI for Amazon’s cloud computing unit, AWS. And while Amazon has built a partnership with and invested $4 billion in AI startup Anthropic, whose LLM Claude is considered competitive with OpenAI’s models, it has been unable to capitalize on that relationship to build a better Alexa. Privacy concerns have kept Alexa’s teams from using Anthropic’s Claude model, former employees say, but so too have Amazon’s ego-driven internal politics.
An Amazon spokesperson said details provided by the former research scientists for this story were “dated” (even though many of those sources left the company in the past six months) and did not reflect the current state of the Alexa LLM. She added that the company has access to hundreds of thousands of GPUs and other AI-specific chips. She also disputed the idea that Alexa has been deprioritized or that Anthropic’s Claude has been off-limits due to privacy concerns, but she declined to provide evidence of how Claude is being used in the new Alexa.
While aspects of Amazon’s struggle to update Alexa are unique, the company’s challenges give an indication of how difficult it is for companies to revamp digital assistants built on older technologies to incorporate generative AI. Apple, too, has faced similar struggles to integrate AI into its products, including its digital assistant Siri. Siri and Alexa share a similar technological pedigree; in fact, Siri debuted three years prior to Alexa, in October 2011. And like Amazon, Apple underinvested in the kind of AI expertise needed to build the massive language models that underpin today’s generative AI, and in the vast clusters of graphics processing units (GPUs), the specialized computer chips such models require. Apple too, like Amazon, has launched a determined, but belated, effort to catch up.
Apple took some big steps towards regaining lost ground in the generative AI race with a set of highly anticipated announcements at its WWDC conference earlier this week. The debut included a big upgrade for Siri, including a more natural-sounding voice and the potential for “on-screen awareness,” which will eventually allow Siri to take more agent-like actions across apps. Apple also announced a Siri integration with ChatGPT. Apple’s announcements only up the pressure on Amazon to deliver the new Alexa.
Unfortunately, there is growing evidence that Amazon is ill-prepared for this renewed battle of the digital assistants, even though many assumed the company would have been perfectly positioned to take Alexa into the generative AI age. Yesterday, Mihail Eric, a former senior machine learning scientist at Alexa AI, took to X (formerly Twitter) to say just that: In a post titled “How Alexa dropped the ball on being the top conversational system on the planet,” Eric, who left Amazon in July 2021, pointed out that Alexa had sold over 500 million devices, “which is a mind-boggling user data moat,” and that “we had all the resources, talent, and momentum to become the unequivocal market leader in conversational AI.” But most of that tech never saw the light of day, he said, because Alexa AI “was riddled with technical and bureaucratic problems.” The dozen former employees Fortune spoke to over the past month echo Eric’s account and add further details to the story of how the Everything Company has failed to do this one thing. The former employees spoke anonymously to avoid violating non-disclosure agreements or non-disparagement clauses they had signed.
Amazon Alexa was caught flat-footed by ChatGPT
Well before ChatGPT wowed the world in November 2022, there was Amazon’s Alexa. The digital assistant was launched in 2014 alongside the Echo smart speaker that served as its hardware interface. The digital assistant, Amazon said, had been inspired by the all-knowing computer featured on Star Trek (Amazon founder Jeff Bezos is a big Star Trek fan). The product quickly became a hit with consumers, selling over 20 million devices by 2017. But Alexa was not built on the same AI models and methods that made ChatGPT groundbreaking. Instead, it was a collection of small machine learning models and thousands of hand-crafted and hard-coded rules that turned a user’s utterances into the actions Alexa performed.
Amazon had been experimenting with some early large language models (all of them much smaller than GPT-3 and GPT-4, the two models OpenAI would use to power ChatGPT), but these were nowhere near ready for deployment in a product. The company was caught flat-footed by the generative AI boom that followed ChatGPT’s late November 2022 launch, former employees say. A frantic, frenetic few months followed as Amazon’s Alexa organization struggled to coalesce around a vision to take the digital assistant from a stilted command-action bot to a truly conversational, helpful agent. Non-generative AI projects were deprioritized overnight, and throughout the 2022 Christmas period executives urged Amazon’s scientists, engineers and product managers to figure out how to ensure Amazon had generative AI products to offer customers. One former Alexa AI project manager described the atmosphere at the company as “a bit panicked.”
Amazon’s response almost immediately ran into trouble, as various teams within Alexa and AWS failed to coalesce around a unified plan. Many employees were still working remotely following the Covid pandemic, leading to people being endlessly “huddled on conference calls debating the minutiae of strategic PRFAQs” (Amazon-speak for a written document used when proposing a product idea in its early stages), the Alexa AI project manager said. The company struggled, he said, to “shift from peacetime to wartime mode.”
One senior Alexa data scientist said this was especially frustrating because he had tried to sound the alarm on the coming wave of generative AI as far back as mid-2022, gathering data to show his director-level leadership, but he said he could not convince them that the company needed to change its AI strategy. Only after ChatGPT launched did the company swing into action, he explained.
The problem is, as hundreds of millions are aware from their stilted discourse with Alexa, the assistant was not built for, and has never been primarily used for, back-and-forth conversations. Instead, it always focused on what the Alexa organization calls “utterances”: the questions and commands like “what’s the weather?” or “turn on the lights” that people bark at Alexa.
In the first months after ChatGPT launched, it was not clear LLMs would be able to trigger these real-world actions from a natural conversation, one Ph.D. research scientist who interned on the Alexa team during this period said. “The idea that an LLM could ‘switch on the lights’ when you said ‘I can’t see, turn it all on’ was not proven yet,” he said. “So the leaders internally clearly had big plans, but they didn’t really know what they were getting into.” (It is now widely accepted that LLMs can, at least in theory, be coupled with other technology to control digital tools.)
Instead, teams were figuring out how to implement generative AI on the fly. That included creating synthetic datasets (in this case, collections of computer-generated dialogues with a chatbot) that they could use to train an LLM. Those building AI models often use synthetic data when there isn’t enough real-world data to improve AI accuracy, or when privacy protection is needed. And remember, most of what the Alexa team had were simple, declarative “utterances.”
“[Customers were] talking in Alexa language,” one former Amazon machine learning scientist said. “So now imagine you want to encourage people to talk in language that has never happened, so where are you going to get the data from to train the model? You have to create it, but that comes with a whole lot of hurdles because there’s a gazillion ways people can say the same thing.”
Also, while Alexa has been integrated with thousands of third-party devices and services, it turns out that LLMs are not terribly good at handling such integrations. According to a former Alexa machine learning manager who worked on Alexa’s smart home capabilities, even OpenAI’s latest GPT-4o model and the latest Google Gemini model, both of which are able to use voice rather than just text, struggle to go from spoken dialogue to performing a task using other software. That requires what is known as an API call, and LLMs do not do this well yet.
“It’s not consistent enough, it hallucinates, gets things wrong, it’s hard to build an experience when you’re connecting to many different devices,” the former machine learning scientist said.
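To make that difficulty concrete, here is a minimal Python sketch of the step being described: a model’s free-form text has to become a well-formed call against a known device before anything can happen in the home. Everything in it is hypothetical and for illustration only. The device names, the JSON shape, and the `parse_action` helper are assumptions, not Amazon’s or any vendor’s actual API.

```python
import json

# Hypothetical registry of devices and the actions each one accepts.
# These names are illustrative assumptions, not a real smart-home API.
DEVICE_ACTIONS = {
    "living_room_lights": {"turn_on", "turn_off"},
    "thermostat": {"set_temperature"},
}


def parse_action(model_output: str):
    """Return (device, action) if the model's text is a valid call, else None."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        # The model replied in prose instead of a structured call.
        return None
    if not isinstance(call, dict):
        return None
    device = call.get("device")
    action = call.get("action")
    if not isinstance(device, str) or not isinstance(action, str):
        return None
    # Reject devices or actions the model made up.
    if action in DEVICE_ACTIONS.get(device, set()):
        return device, action
    return None


# Three kinds of output a model might produce for "turn on the lights".
well_formed = '{"device": "living_room_lights", "action": "turn_on"}'
hallucinated = '{"device": "garage_door", "action": "open"}'
prose_reply = "Sure, turning on the lights!"

results = [parse_action(text) for text in (well_formed, hallucinated, prose_reply)]
```

Only the first output survives validation. The other two, a made-up device and a chatty sentence, are the kinds of inconsistency that would have to be caught before a command reaches a real device.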
As spring gave way to the summer of 2023, many in Alexa’s rank and file remained in the dark about how the digital assistant would meet the generative AI moment. The project lacked vision, former employees said. “I remember my team and myself complaining a lot to our superiors that it wasn’t transparent what the vision looks like, it wasn’t transparent what exactly we’re trying to launch,” one said. Another former manager said the new Alexa LLM was talked about in the months prior to the September demo, but it wasn’t clear what it would mean. “We were just hearing things like, ‘Oh yeah, this is coming,’” he said. “But we had no idea what it was or what it would look like.”
Alexa LLM demo didn’t meet ‘go/no-go’ criteria
The September 2023 Alexa demo made it seem like a widespread rollout of the new Alexa LLM was imminent. But the new language model-based Alexa ultimately “didn’t meet the go/no-go criteria,” one former employee said. LLMs are known for producing hallucinations and sometimes toxic content, and Amazon’s was no different, making broad release risky.
This, former employees say, is the reason Alexa’s “Let’s Chat” feature has never made it into broad release. “It’s very hard to make AI safe enough and test all aspects of that black box in order to release it,” a former manager said.
The September 2023 demo, he pointed out, involved different functionality than what Alexa was best known for: taking a command and executing it. Ensuring Alexa could still perform those old functions while also enabling the conversational dialogue the new Alexa promised would be no easy task. The manager said it was increasingly clear to him that the organization would, at least temporarily, need to maintain two completely different technology stacks, one supporting Alexa’s old features and another the new ones. But managers did not want to entertain that idea, he said. Instead, the message at the company at the time he was laid off in November 2023 was still “we need to basically burn the bridge with the old Alexa AI model and pivot to only working on the new one.”
Even as the new Alexa LLM rollout floundered, Amazon executives set ever more lofty generative AI goals. Right before the demo, Prasad, the Amazon SVP who had served as Alexa’s head scientist, was promoted to a new role designed to bring the company’s disparate research teams under a single umbrella, with a goal to develop human-level artificial general intelligence, or AGI. The move put Amazon in direct competition with companies like OpenAI, Google DeepMind, and Anthropic, which have the creation of AGI as their founding mission. Meta CEO Mark Zuckerberg has also recently said that creating AGI is his company’s mission too.
By November 2023, there was word that Amazon was investing millions in training an AI model, codenamed Olympus, that would have 2 trillion parameters, or tunable variables. Parameters are a rough approximation of a model’s size and complexity. And Olympus’s reported parameter count would make it double the reported size of OpenAI’s most capable model, GPT-4.
The former research scientist who worked on the Alexa LLM said Project Olympus is “a joke,” adding that the largest model in progress is 470 billion parameters. He also emphasized that the current Alexa LLM version is unchanged from the 100 billion-parameter model that was used for the September 2023 demo, but has had more pretraining and fine-tuning done on it to improve it. (To be sure, 100 billion parameters is still a relatively powerful model. Meta’s Llama 3, as a comparison, weighs in at 70 billion parameters.)
A lack of data made it tough to ‘get some magic’ out of the LLM
In the months following the September 2023 demo, a former research scientist who worked on building the new Alexa LLM recalled how Alexa leadership, including Amazon’s generative AI chief Rohit Prasad, pushed the team to work harder and harder. The message was to “get some magic” out of the LLM, the research scientist said. But the magic never happened. A lack of adequate data was one of the main reasons why, former employees said.
Meta’s Llama 3 was pre-trained on 15 trillion tokens, the smallest unit of data that an LLM processes. The Alexa LLM has only been trained on 3 trillion. (Unlike parameters, which are the number of tunable settings that a model has, a token is the small unit of data, such as a word, that the model processes during training.) Meanwhile, “fine-tuning” an AI model (which takes a pre-trained model and further hones it for specific tasks) also benefits from larger datasets than what Amazon has at the ready. Meta’s Llama 3 model was fine-tuned on 10 million data points. The LLM built by Amazon’s AGI organization has so far accumulated only around 1 million, with only 500,000 high-quality data points, the former Alexa LLM research scientist said.
One of the many reasons for that, he explained, is that Amazon insists on using its own data annotators (people responsible for labeling data so that AI models can recognize patterns) and that organization is very slow. “So we can never never get high quality data from them after several rounds, even after one year of developing the model,” he said.
Beyond a paucity of data, the Alexa team also lacks access to the vast quantities of the latest Nvidia GPUs, the specialized chips used to train and run AI models, that the teams at OpenAI, Meta, and Google have, two sources told Fortune. “Most of the GPUs are still A100, not H100,” the former Alexa LLM research scientist added, referring to the most powerful GPU Nvidia currently has available.
At times, building the new Alexa has taken a backseat to other generative AI priorities at Amazon, they said. Amazon’s main focus after ChatGPT launched was to roll out Bedrock, a new AWS cloud computing service that allows customers to build generative AI chatbots and other applications in the cloud, which was announced in April 2023 and made generally available in September. AWS is a critical profit-driver for Amazon.
Alexa, on the other hand, is a cost center (the division reportedly loses billions each year) and is mostly viewed as a way to keep customers engaged with Amazon and as a way to gather data that can help Amazon and its partners better target advertising. The LLM that Amazon scientists are building (a version of which will also power Alexa) is also first being rolled out to AWS’ business-focused generative AI assistant Amazon Q, said a former Alexa LLM scientist who left within the past few months, because the model is now considered good enough for specific enterprise use cases. Amazon Q also taps Anthropic’s Claude AI model. But Alexa’s LLM team has not been allowed to use Claude due to concerns about data privacy.
Amazon’s spokesperson said the assertion about Claude and privacy is false, and disputed other details about Amazon’s LLM effort that Fortune heard from multiple sources. “It’s simply inaccurate to state Amazon Q is a higher priority than Alexa. It’s also incorrect to state that we’re using the same LLM for Q and Alexa.”
Bureaucracy and infrastructure issues slowed down Alexa’s gen AI efforts
One former Alexa AI employee who has hired several people who had been working on the new Alexa LLM said that most have mentioned “feeling exhausted” by the constant pressure to ready the model for a launch that is repeatedly postponed, and frustrated because other work is on hold in the meantime. A few have also conveyed a growing skepticism as to whether the overall design of the LLM-based Alexa even makes sense, he added.
“One story I heard was that early in the project, there was a big push from senior executives who had become overconfident after experimenting with ChatGPT, and that this overconfidence has persisted among some senior leaders who continue to drive toward an unrealistic-feeling goal,” he said. Another former Alexa LLM scientist said managers set unachievable deadlines. “Every time the managers assigned us a task related to [the] LLM, they requested us to complete it within a very short period of time (e.g., 2 days, one week), which is impossible,” he said. “It seems the leadership doesn’t know anything about LLMs; they don’t know how many people they need and what should be the expected time to complete each task for building a successful product like ChatGPT.”
Alexa never aligned with Jeff Bezos’ idea of “two-pizza teams,” that is, that teams should ideally be small enough that you could cater a full team meeting with just two pizzas. Bezos thought smaller teams drove effective decision-making and collaboration. Instead, Alexa has historically been, and remains for the most part, a giant division. Prior to the most recent layoffs, it had 10,000 employees. And while it has fewer now, it is still organized into large, siloed domains such as Alexa Home, Alexa Entertainment, Alexa Music and Alexa Shopping, each with hundreds of employees, along with directors and a VP at the top.
As pressure grew for each domain to work with the new Alexa LLM to craft generative AI features, each of which required accuracy benchmarks, the domains came into conflict, with sometimes counterproductive results, sources said.
For instance, a machine learning scientist working on Alexa Home recalled that while his domain was working on ways for Alexa to help users control their lights or the thermostat, the Music domain was busy working on how to get Alexa to understand very specific requests like “play Rihanna, then Tupac, and then pause 30 minutes and then play DMX.”
Each domain team had to build its own relationship with the central Alexa LLM team. “We spent months working with those LLM guys just to understand their structure and what data we could give them to fine-tune the model to make it work.” Each team wanted to fine-tune the AI model for its own domain goals.
But as it turned out, if the Home team tried to fine-tune the Alexa LLM to make it more capable for Home questions, and then the Music team came along and fine-tuned it using their own data for Music, the model would wind up performing worse. “Catastrophic forgetting,” where what a model learns later in training degrades its ability to perform well on tasks it encountered earlier in training, is a problem with all deep learning models. “As it gets better in Music, [the model] can get less smart at Home,” the machine learning scientist said. “So finding the sweet spot where you’re trying to fine tune for 12 domains is almost a lottery.” These days, he added, LLM scientists know that fine-tuning may not be the best technique for creating a model with both rich capabilities and flexibility; there are others, like prompt engineering, that may do better. But by then, many months had gone by with little progress to show for it.
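The effect the scientist describes can be reproduced in miniature. The Python sketch below is a toy illustration under loud assumptions: a “model” with a single shared weight and two made-up tasks standing in for the Home and Music domains. It is not how the Alexa LLM is built or trained. It only shows that fine-tuning the same parameters on a second task can erase what was learned for the first.

```python
def mse(w, data):
    """Mean squared error of the one-weight model y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def fine_tune(w, data, lr=0.05, epochs=200):
    """Plain gradient descent on the single shared weight."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


# Two toy "domains" that want different things from the same weight.
# The target functions are arbitrary assumptions chosen for illustration.
xs = [1.0, 2.0, 3.0]
home = [(x, 2.0 * x) for x in xs]    # "Home" task: y = 2x
music = [(x, -1.0 * x) for x in xs]  # "Music" task: y = -x

w = 0.0
w = fine_tune(w, home)               # first, fine-tune for Home
home_loss_before = mse(w, home)      # close to zero: Home is learned

w = fine_tune(w, music)              # then fine-tune the same weight for Music
home_loss_after = mse(w, home)       # large: Home has been forgotten
music_loss_after = mse(w, music)     # close to zero: Music is learned

print(f"Home loss after Home tuning:  {home_loss_before:.6f}")
print(f"Home loss after Music tuning: {home_loss_after:.6f}")
print(f"Music loss after Music tuning: {music_loss_after:.6f}")
```

A real LLM has billions of weights rather than one, so the interference is partial rather than total, but the mechanism is the same: each round of fine-tuning moves shared parameters toward the newest data.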
Each Alexa domain, with its own leadership, wanted to protect and expand its fiefdom, one former product manager said. “This organization has just turned out into something like a mafia,” she said. “Let’s say, if I work for you, I’m just taking orders because it is in my best interest to agree with you. It is my best interest to not get chopped off in the next layoff; it’s quite ruthless. It’s in my best interest because you’re going to help me build my empire.”
Amazon says it stands by its commitment to Alexa
Amazon insists it is fully committed to delivering a generative AI Alexa, adding that its vision remains to build the “world’s best personal assistant.” An Amazon representative pointed out that over half a billion Alexa-enabled devices have been sold, and customers interact with Alexa tens of millions of times every hour.
She added that the implementation of generative AI comes with “huge responsibility: the details really matter” with a technical implementation of this scale, on a device that millions of customers have welcomed into their home. While the Alexa LLM “Let’s chat” feature has not been rolled out to the general public, it has been tested on small groups of customers “on an ongoing basis.”
But many of the employees Fortune spoke to said they left in part because they despaired that the new Alexa would ever be ready, or that by the time it is, it will have been overtaken by products launched by nimbler competitors, such as OpenAI. Those companies do not have to navigate an existing tech stack and defend an existing feature set. The former employee who has hired several people who left the Alexa organization over the past year said many were pessimistic about the Alexa LLM launch. “They just didn’t see that it was actually going to happen,” he said.
It’s possible, say some of the employees Fortune interviewed, that Amazon will finally launch an LLM-based Alexa, and that it will be an improvement on today’s Alexa. After all, there are hundreds of millions of Alexa users out there in the world who would certainly be happy if the device sitting on their desk or kitchen counter could do more than execute simple commands.
But given the challenges weighing down the Alexa LLM effort, and the gap separating it from the offerings of generative AI leaders like OpenAI and Google, none of the sources Fortune spoke with believe Alexa is close to accomplishing Amazon’s mission of being “the world’s best personal assistant,” let alone Amazon founder Jeff Bezos’ vision of creating a real-life version of the helpful Star Trek computer. Instead, Amazon’s Alexa runs the risk of becoming a digital relic with a cautionary tale: that of a potentially game-changing technology that got stuck playing the wrong game.