Last week, Google unveiled its biggest change to search in years, showcasing new artificial intelligence capabilities that answer people’s questions in the company’s attempt to catch up to rivals Microsoft and OpenAI.
The new technology has since generated a litany of untruths and errors — including recommending glue as part of a pizza recipe and the ingesting of rocks for nutrients — giving a black eye to Google and causing a furor online.
The incorrect answers in the feature, called AI Overview, have undermined trust in a search engine that more than two billion people turn to for authoritative information. And while other A.I. chatbots tell lies and act weird, the backlash demonstrated that Google is under more pressure to safely incorporate A.I. into its search engine.
The launch also extends a pattern of Google’s having issues with its newest A.I. features immediately after rolling them out. In February 2023, when Google announced Bard, a chatbot to battle ChatGPT, it shared incorrect information about outer space. The company’s market value subsequently dropped by $100 billion.
This February, the company released Bard’s successor, Gemini, a chatbot that could generate images and act as a voice-operated digital assistant. Users quickly realized that the system refused to generate images of white people in most instances and drew inaccurate depictions of historical figures.
With each mishap, tech industry insiders have criticized the company for dropping the ball. But in interviews, financial analysts said Google needed to move quickly to keep up with its rivals, even if it meant growing pains.
Google “doesn’t have a choice right now,” Thomas Monteiro, a Google analyst at Investing.com, said in an interview. “Companies need to move really fast, even if that includes skipping a few steps along the way. The user experience will just have to catch up.”
Lara Levin, a Google spokeswoman, said in a statement that the vast majority of AI Overview queries resulted in “high-quality information, with links to dig deeper on the web.” The A.I.-generated result from the tool typically appears at the top of a results page.
“Many of the examples we’ve seen have been uncommon queries, and we’ve also seen examples that were doctored or that we couldn’t reproduce,” she added. The company will use “isolated examples” of problematic answers to refine its system.
Since OpenAI released its ChatGPT chatbot in late 2022 and it became an overnight sensation, Google has been under pressure to integrate A.I. into its popular apps. But there are challenges in taming large language models, which learn from enormous amounts of data taken from the open web — including falsehoods and satirical posts — rather than being programmed like traditional software.
(The New York Times sued OpenAI and its partner, Microsoft, in December, claiming copyright infringement of news content related to A.I. systems.)
Google announced AI Overview to fanfare at its annual developer conference, I/O, last week. For the first time, the company had plugged Gemini, its latest large language A.I. model, into its most important product, its search engine.
AI Overview combines statements generated from its language models with snippets from live links across the web. It can cite its sources, but does not know when that source is incorrect.
The system was designed to answer more complex and specific questions than regular search. The result, the company said, was that the public would be able to benefit from all that Gemini could do, taking some of the work out of searching for information.
But things quickly went awry, and users posted screenshots of problematic examples to social media platforms like X.
AI Overview instructed some users to mix nontoxic glue into their pizza sauce to prevent the cheese from sliding off, a fake recipe it seemed to borrow from an 11-year-old Reddit post meant to be a joke. The A.I. told other users to ingest at least one rock a day for vitamins and minerals — advice that originated in a satirical post from The Onion.
As the company’s cash cow, Google search is “the one property Google needs to keep relevant/trustworthy/useful,” Gergely Orosz, a software engineer with a newsletter on technology, Pragmatic Engineer, wrote on X. “And yet, examples on how AI overviews are turning Google search into garbage are all over my timeline.”
People also shared examples of Google’s telling users in bold font to clean their washing machines using “chlorine bleach and white vinegar,” a mixture that when combined can create harmful chlorine gas. In a smaller font, it told users to clean with one, then the other.
Social media users have tried to one-up one another with who could share the most outlandish responses from Google. In some cases, they doctored the results. One manipulated screenshot appeared to show Google saying that a good remedy for depression was jumping off the Golden Gate Bridge, citing a Reddit user. Ms. Levin, the Google spokeswoman, said that the company’s systems never returned that result.
AI Overview did, however, struggle with presidential history, saying that 17 presidents were white and that Barack Obama was the first Muslim president, according to screenshots posted to X.
It also said Andrew Jackson graduated from college in 2005.
Kevin Roose contributed reporting.