A new large language model (LLM) appears to have taken the performance crown from OpenAI’s GPT-4o about a month after its release: Claude 3.5 Sonnet, the new chatbot and LLM from rival AI company Anthropic, released today, outperforms all others in the world on key third-party benchmark tests, according to the company. And it does so while being faster and cheaper than previous Claude 3 models.

But it’s one thing to drop a new model and claim dominance, and another for users to actually experience and benefit from the performance gains (Google Gemini family, I’m looking at you: supposedly better than OpenAI’s previous flagship GPT-4 on some metrics, but who is really using you?).

Anthropic’s release of Claude 3.5 Sonnet does not seem to have this problem. Many AI influencers and power users took to the web in the few hours following its release to share their largely positive impressions of Anthropic’s new model, and to show what the new “smartest” LLM in the world is capable of achieving.

Programming skills and product creation

As enterprise AI influencer and expert Allie K. Miller wrote on X, Claude 3.5 Sonnet was able to create a fully playable game for her based on just a screenshot, in less than half a minute:

Likewise, the informative and timely X account @TestingCatalog News showed how the newly launched “Artifacts” playground, which debuted alongside Claude 3.5 Sonnet and displays interactive output right next to the chatbot interface, can execute code for a real, working web model created by Claude 3.5 Sonnet.

It was even able to recreate images from the 1995 film Pirates:

Pietro Schirano, founder of AI image generation startup EverArt, wrote on X that the combination of Claude 3.5 Sonnet and another tool, Maestro, showed “sparks of artificial general intelligence?”

Anthropic staff weigh in on Claude 3.5 Sonnet

Despite his obvious bias, Anthropic Developer Relations Team Leader Alex Albert posted a thread on X […] be written by LLMs.

Likewise, Anthropic Technical Officer Maggie Fu posted on X that Claude 3.5 Sonnet can now do “half my job… and I couldn’t be happier.”

Pressure on OpenAI

Others have noted that now that Claude 3.5 Sonnet has overtaken OpenAI’s GPT-4o and become available at similar prices, the latter company is under renewed pressure to continue presenting its models as the right choice.

Ethan Mollick, a professor at the Wharton School of Business at the University of Pennsylvania and an AI booster, compared the Artifacts feature to a “simpler version of the Code Interpreter” from OpenAI’s GPT-4.

X user @Kimmonismus went further, saying that OpenAI will “sleep through AGI,” or artificial general intelligence, the company’s stated goal of an AI model that outperforms humans at most economically valuable tasks. They criticized the company for announcing additional features with GPT-4o that had not yet shipped, including new audio capabilities.

Still not at human level

Despite the high praise for

Likewise, technology journalist Timothy B. Lee, better known as @binarybits on X, noted that it “still makes stupid mistakes sometimes,” posting a screenshot in which he asked it to answer a simple math word problem: Which is worth more, 100 pennies or three quarters? It answered three quarters, even though 100 pennies come to $1.00 and three quarters to only $0.75.

However, even with these minor issues, Claude 3.5 Sonnet appears to represent a huge leap for Anthropic and LLMs in general, and shows that the performance gains achieved by individual AI model makers are certainly not slowing down with current levels of available computing resources (i.e., GPUs).
