The AI arms race continues apace: Anthropic is launching its latest model, called Claude 3.5 Sonnet, which it says can equal or outperform OpenAI’s GPT-4o or Google’s Gemini across a wide range of tasks. The new model is already available to Claude users on the web and on iOS, and Anthropic will make it available to developers as well.
Claude 3.5 Sonnet will ultimately be the middle model in the lineup: Anthropic uses the name Haiku for its smallest model, Sonnet for its mainstream middle option, and Opus for its top model. (The names are weird, but every AI company seems to name things in its own weird way, so we’ll let it slide.) But the company says 3.5 Sonnet beats 3 Opus, and its benchmarks show it does so by a very wide margin. The new model also appears to be twice as fast as the previous one, which may be an even bigger deal.
AI model benchmarks should always be taken with a grain of salt: there are so many of them that it’s easy to pick and choose the ones that make you look good, and models and products change so quickly that no one seems to hold a lead for very long. Still, Claude 3.5 Sonnet looks impressive: it beats GPT-4o, Gemini 1.5 Pro, and Meta’s Llama 3 400B on seven of nine overall benchmarks and four of five vision benchmarks. Again, don’t read too much into this, but Anthropic appears to have built a legitimate competitor in this space.
What does all this actually mean? Anthropic says Claude 3.5 Sonnet will be much better at writing and compiling code, handling multi-step workflows, interpreting charts and graphs, and transcribing text from images. The new and improved Claude also seems to be better at understanding humor and can write in a more human-sounding way.
Along with the new model, Anthropic is also introducing a new feature called Artifacts. With Artifacts, you’ll be able to see and interact with the results of your Claude requests: if you ask the model to design something for you, it can now show you what it looks like and let you edit it directly in the app. If Claude writes you an email, you can edit the email in the Claude app instead of having to copy it into a text editor. It’s a small feature, but a smart one: these AI tools need to become more than simple chatbots, and features like Artifacts give the app more to do.
Artifacts actually appears to be a signal of Anthropic’s long-term vision for Claude. Anthropic has long said it focuses mostly on enterprises (even as it hires people from consumer technology, such as Instagram co-founder Mike Krieger). It said in its press release announcing Claude 3.5 Sonnet that it plans to turn Claude into a tool for companies to “securely centralize their knowledge, documents, and ongoing work in one shared space.” This sounds more like Notion or Slack than ChatGPT, with Anthropic’s models at the center of the entire system.
But for now, the model is the big news. The pace of improvement here is exciting to watch: Anthropic launched Claude 3 Opus in March, proudly saying that it was just as good as GPT-4 and Gemini 1.0, before OpenAI and Google released better versions of their models. Now, Anthropic has made its next move, and it certainly won’t be long before its competitors do too. Claude isn’t talked about as much as Gemini or ChatGPT, but it’s very much in the running.