OpenAI competitor Anthropic is launching a powerful new AI model called Claude 3.5 Sonnet. But it is more of an incremental step than a giant leap forward.
Claude 3.5 Sonnet can analyze both text and images as well as generate text, and it is Anthropic’s best-performing model to date – at least on paper. Across several AI benchmarks for reading, coding, math, and vision, Claude 3.5 Sonnet outperforms the model it replaces, Claude 3 Sonnet. It also outperforms Anthropic’s previous flagship model, Claude 3 Opus.
Benchmarks are not necessarily the most useful measure of AI progress, partly because many of them test fringe, esoteric situations that do not apply to the average person, such as answering health exam questions. But for what it’s worth, Claude 3.5 Sonnet narrowly beats leading competing models, including the recently launched GPT-4o from OpenAI, on some of the benchmarks Anthropic tested.
Along with the new model, Anthropic is launching what it calls Artifacts, a workspace where users can edit and add to content (such as code and documents) generated by Anthropic’s models. Anthropic says Artifacts is currently in preview and will gain new features in the near future, such as ways to collaborate with larger teams and store knowledge bases.
Focus on efficiency
Claude 3.5 Sonnet is slightly more performant than Claude 3 Opus, and Anthropic says the model better understands nuanced and complex instructions, as well as concepts like humor. (Artificial intelligence is not known for being funny.) But perhaps more importantly for developers who build applications with Claude that require quick responses (such as customer service chatbots), Claude 3.5 Sonnet is faster. It is twice as fast as Claude 3 Opus, Anthropic claims.
Vision, meaning image analysis, is one area in which Claude 3.5 Sonnet has improved significantly over Claude 3 Opus, according to Anthropic. Claude 3.5 Sonnet can more accurately interpret charts and graphs and transcribe text from “imperfect” images, such as images with distortions and visual artifacts.
Michael Gerstenhaber, head of product at Anthropic, says the improvements came from architectural tweaks and new training data, including data generated by artificial intelligence. Which data, exactly? Gerstenhaber would not say, but he implied that Claude 3.5 Sonnet derives much of its strength from these training sets.
“What matters to [businesses] is whether or not AI helps them meet their business needs, not whether or not AI is competitive on a benchmark,” Gerstenhaber told TechCrunch. “From that perspective, I think Claude 3.5 Sonnet will be a step ahead of anything else we have available – and also ahead of anything else in the industry.”
The secrecy surrounding training data may be for competitive reasons. But it could also be meant to shield Anthropic from legal challenges, particularly those concerning fair use. Courts have yet to decide whether vendors like Anthropic and its competitors, such as OpenAI, Google, and Amazon, have the right to train on public data, including copyrighted data, without compensating or crediting the creators of that data.
So, all we know is that Claude 3.5 Sonnet was trained on a lot of text and images, like previous Anthropic models, as well as feedback from human testers to try to “align” the model with users’ intentions, in hopes of preventing it from producing toxic or otherwise problematic text.
What else do we know? Well, Claude 3.5 Sonnet’s context window, the amount of text the model can analyze before generating new text, is 200,000 tokens, the same as Claude 3 Sonnet’s. Tokens are subdivided pieces of raw data, such as the syllables “fan,” “tas,” and “tic” in the word “fantastic”; 200,000 tokens is equivalent to about 150,000 words.
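The token-to-word figures above imply a ratio of roughly 0.75 words per token. As a minimal sketch, assuming that ratio holds for ordinary English text (real tokenizers vary by language and content), a developer could estimate whether a document fits in the context window. The function names here are hypothetical and not part of any Anthropic tool.

```python
# Figures taken from the article: 200,000 tokens is about 150,000 words.
CONTEXT_WINDOW_TOKENS = 200_000
WORDS_PER_TOKEN = 150_000 / 200_000  # 0.75; an approximation, not an exact rule


def estimate_tokens(word_count: int) -> int:
    """Roughly estimate how many tokens a text of `word_count` words uses."""
    return round(word_count / WORDS_PER_TOKEN)


def fits_in_context(word_count: int) -> bool:
    """Return True if the estimated token count fits in the context window."""
    return estimate_tokens(word_count) <= CONTEXT_WINDOW_TOKENS
```

Under this estimate, a 75,000-word manuscript would take up about half the window.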
We also know that Claude 3.5 Sonnet is available today. Users of Anthropic’s web client and the Claude iOS app can access it for free; subscribers to Anthropic’s paid plans, Claude Pro and Claude Team, get five times higher rate limits. Claude 3.5 Sonnet is also available through Anthropic’s API and on managed platforms such as Amazon Bedrock and Google Cloud’s Vertex AI.
“Claude 3.5 Sonnet is truly a step change in intelligence without sacrificing speed, and it sets us up for future releases along the entire Claude model family,” Gerstenhaber said.
Claude 3.5 Sonnet also powers Artifacts, which opens a dedicated window in the Claude web client when the user asks the model to generate content such as code snippets, text documents, or website designs. “Artifacts are model outputs that set generated content to the side and allow you, as a user, to iterate on that content,” explains Gerstenhaber. “Let’s say you want to generate code – the artifact will be placed in the UI, and then you can talk to Claude and iterate on the document to improve it so you can run the code.”
The bigger picture
So what is the significance of Claude 3.5 Sonnet in the broader context of Anthropic – and the AI ecosystem, for that matter?
Claude 3.5 Sonnet shows that incremental progress is the extent of what we can expect on the model front right now, barring a major research breakthrough. The past few months have seen flagship releases from Google (Gemini 1.5 Pro) and OpenAI (GPT-4o) that move the needle only marginally in terms of benchmark and qualitative performance. But there has not been a leap to match the jump from GPT-3 to GPT-4 in some time, owing to the rigidity of current model architectures and the massive compute they require for training.
While generative AI vendors are turning their attention to data curation and licensing in lieu of promising new scalable architectures, there are signs that investors are growing wary of the longer-than-expected path to return on investment for generative AI. Anthropic is somewhat immune to this pressure, being in the enviable position of serving as Amazon’s (and to a lesser extent Google’s) insurance against OpenAI. But the company’s revenue, expected to reach just under a billion dollars by the end of 2024, is a fraction of OpenAI’s – and I’m sure Anthropic’s backers don’t let it forget that fact.
Despite a growing customer base that includes household brands like Bridgewater, Brave, Slack, and DuckDuckGo, Anthropic still lacks a certain enterprise cachet. Tellingly, it was OpenAI, not Anthropic, with which PricewaterhouseCoopers recently partnered to resell generative AI offerings to the enterprise.
So Anthropic is taking a strategic, well-trodden approach to gaining ground, investing development time in products like Claude 3.5 Sonnet to deliver slightly better performance at commodity prices. Claude 3.5 Sonnet is priced the same as Claude 3 Sonnet: $3 per million tokens fed into the model and $15 per million tokens generated by the model.
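To make those prices concrete, here is a minimal sketch of how a developer might estimate the cost of a single request. It assumes only the two per-million-token rates quoted above; the function name is hypothetical, and actual bills may differ.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 3.00    # tokens fed into the model
OUTPUT_USD_PER_MILLION = 15.00  # tokens generated by the model


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000
```

For example, a request with 2,000 input tokens and 500 output tokens would come to about $0.0135 at these rates, with the shorter output accounting for more than half of the cost.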
Gerstenhaber talked about this in our conversation. “When you build an app, the end user doesn’t have to know which model is being used or how an engineer has improved their experience, but the engineer can have the tools available to improve that experience along the vectors that need improvement, and cost is certainly one of them,” he said.
Claude 3.5 Sonnet does not solve the problem of hallucinations. It almost certainly makes mistakes. But it might be attractive enough to get developers and businesses to switch to Anthropic’s platform. Ultimately, this is what matters to Anthropic.
To that same end, Anthropic has doubled down on tooling such as its experimental AI steering tool, which lets developers “steer” its models’ internal features; integrations that let its models take actions within applications; and tools built on top of its models, such as the Artifacts experience mentioned above. It has also hired a co-founder of Instagram as head of product. And it has expanded the availability of its products, most recently bringing Claude to Europe and establishing offices in London and Dublin.
Anthropic seems to have grasped that building an ecosystem around models, not just models in isolation, is the key to retaining customers as the capabilities gap between models narrows.
However, Gerstenhaber insisted that bigger and better models – such as Claude 3.5 Opus – are on the near horizon, with features such as web search and the ability to remember preferences.
“I have not seen deep learning hit a wall yet, and I’ll leave it to the researchers to speculate about the wall, but I think it’s a little early to draw conclusions about that, especially if you look at the pace of innovation,” he said. “There is very rapid development and very rapid innovation, and I have no reason to believe it will slow down.”
We will see.