Jaque Silva | SOPA Images | Lightrocket | Getty Images
Google on Wednesday released Gemini 2.0 — its “most capable” artificial intelligence model suite yet — to everyone.
In December, the company gave access to developers and trusted testers, as well as wrapping some features into Google products, but this is a “general release,” according to Google.
The suite of models includes 2.0 Flash, which is billed as a “workhorse model, optimal for high-volume, high-frequency tasks at scale”; 2.0 Pro Experimental, which is largely focused on coding performance; and 2.0 Flash-Lite, which Google bills as its “most cost-efficient model yet.”
Gemini Flash costs developers 10 cents per million tokens for text, image and video inputs, while Flash-Lite, its more cost-effective version, costs .75 of a cent for the same.
The continued releases are part of a broader strategy for Google of investing heavily into “AI agents” as the AI arms race heats up among tech giants and startups alike.
Meta, Amazon, Microsoft, OpenAI and Anthropic are also moving toward agentic AI, or models that can complete complex multi-step tasks on a user’s behalf, rather than a user having to walk them through every individual step.
“Over the last year, we have been investing in developing more agentic models, meaning they can understand more about the world around you, think multiple steps ahead, and take action on your behalf, with your supervision,” Google wrote in a December blog post, adding that Gemini 2.0 has “new advances in multimodality — like native image and audio output — and native tool use,” and that the family of models “will enable us to build new AI agents that bring us closer to our vision of a universal assistant.”
Anthropic, the Amazon-backed AI startup founded by ex-OpenAI research executives, is a key competitor in the race to develop AI agents. In October, the startup said that its AI agents were able to use computers like humans to complete complex tasks. Anthropic’s computer use capability allows its technology to interpret what’s on a computer screen, select buttons, enter text, navigate websites and execute tasks through any software and real-time internet browsing, the startup said.
The tool can “use computers in basically the same way that we do,” Jared Kaplan, Anthropic’s chief science officer, told CNBC in an interview at the time. He said it can do tasks with “tens or even hundreds of steps.”
OpenAI released a similar tool recently, introducing a feature called Operator that will automate tasks such as planning vacations, filling out forms, making restaurant reservations and ordering groceries. The Microsoft-backed startup described Operator as “an agent that can go to the web to perform tasks for you.”
Earlier this week, OpenAI announced another tool called Deep Research that allows an AI agent to compile complex research reports and analyze questions and topics of the user’s choice. Google in December launched a similar tool of the same name — Deep Research — which acts as a “research assistant, exploring complex topics and compiling reports on your behalf.”
CNBC first reported in December that Google would introduce several AI features early in 2025.
“In history, you don’t always need to be first but you have to execute well and really be the best in class as a product,” CEO Sundar Pichai said in a strategy meeting at the time. “I think that’s what 2025 is all about.”
#Google #opens #Gemini #powerful #model