Anthropic's Claude 3.5 Sonnet wows AI users: 'This is wild'

The large language model (LLM) performance crown appears to have been taken from OpenAI’s GPT-4o about a month after its release: Claude 3.5 Sonnet, the new chatbot and LLM from rival AI company Anthropic, released today, outperforms all other models in the world on key third-party benchmark tests, according to the company. And it does so while being faster and cheaper than previous Claude 3 models.

But it’s one thing to release a new model and claim dominance, and another for users to actually experience and benefit from the performance gains (Google Gemini family, I’m looking at you: supposedly better than OpenAI’s previous flagship GPT-4 on some metrics, but who is really using you?).

Anthropic’s release of Claude 3.5 Sonnet does not seem to have this problem. Many AI influencers and power users took to the web in the first few hours following its release to share their largely positive impressions of Anthropic’s new model and to show what the new “smartest” LLM in the world is capable of achieving.

Develop programming skills and create products

As enterprise AI influencer and expert Allie K. Miller wrote on X, Claude 3.5 Sonnet was able to create a fully playable game for her based on just a screenshot, in less than half a minute:

This is wild.
In just 25 seconds, Claude 3.5 Sonnet coded a fully functional Mancala web application for me.
I provided only one screenshot of the game instructions.
It did the rest:
- Coded the entire game
- Previewed it so I could test
- Provided game rules
pic.twitter.com/WLweZUGt5C
– Allie K. Miller (@alliekmiller) June 20, 2024

Likewise, the informative and timely X account @testingcatalog showed how the newly launched “Artifacts” playground, which debuted alongside Claude 3.5 Sonnet and displays interactive output next to the chatbot interface, can run the code for a real, working web form created by Claude 3.5 Sonnet.

Claude 3.5 just generated React JSX code for a simple contact form and was able to get it to work in the Artifacts playground. pic.twitter.com/KREZaArObw
– TestingCatalog News (@testingcatalog) June 20, 2024

It was even able to recreate visuals from the 1995 film Hackers:

Pietro Schirano, founder of AI image generation startup EverArt, wrote on X that the combination of Claude 3.5 Sonnet and another tool, Maestro, might show “sparks of AGI,” or artificial general intelligence.

Claude 3.5 Sonnet + Maestro = Sparks of AGI?
I asked it to create a version of Mario using only geometric shapes, and the wildest part is that it animated the character as well, and the shapes look like new concepts.
It took 3 minutes. Look at the game! pic.twitter.com/YVQYp7m5Ed
– Pietro Schirano (@skirano) June 20, 2024

Anthropic staff vouch for Claude 3.5 Sonnet

Despite his obvious bias, Alex Albert, who leads Anthropic’s developer relations team, posted a thread on X arguing that within a year, a large percentage of code will be written by LLMs.

Claude started getting better at programming and fixing pull requests independently. It has become clear that within one year, a large percentage of code will be written by LLMs.
Let me show you what I mean:
– Alex Albert (@alexalbert__) June 20, 2024

Likewise, Anthropic technical officer Maggie Fu posted on X that Claude 3.5 Sonnet can now do “half my job… and I couldn’t be happier.”

Pressure on OpenAI

Others have noted that now that Claude 3.5 Sonnet has overtaken OpenAI’s GPT-4o and become available at similar prices, the latter company is under renewed pressure to continue presenting its models as the right choice.

Ethan Mollick, a professor at the Wharton School of Business at the University of Pennsylvania and an AI booster, compared the Artifacts feature to a “simpler version of the Code Interpreter” from OpenAI’s GPT-4.

I was testing the new Claude 3.5 model, and now that it’s out, I can say it’s very impressive, and the “artifacts” it generates are like a simpler version of the Code Interpreter.
This is a real-time video of me creating and editing a playable game with Claude pic.twitter.com/bWqw8F8CdH
– Ethan Mollick (@emollick) June 20, 2024

X user @kimmonismus went further, saying that OpenAI will “sleep through AGI,” or artificial general intelligence, the company’s stated goal of an AI model that outperforms humans at most economically valuable tasks. They criticized the company for announcing additional features with GPT-4o that had not yet shipped, including new audio modes.

Hey, @OpenAI. You sleep through AGI. While they promise all the time (“Patience, Jimmy, it will be worth the wait”) and advertise without delivering (“GPT-4o-Voice within weeks”), the competition manages to deliver without making big announcements in advance! Take a sheet of paper… https://t.co/o6ROsZwDRG
– Chubby ♨️ (@kimmonismus) June 20, 2024

Still not at human level

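Despite the high praise for Claude 3.5 Sonnet, Noam Brown (@polynoamial) noted on X that frontier models still struggle with basic tasks such as tic-tac-toe: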

Frontier models like GPT-4o (and now Claude 3.5 Sonnet) may be on par with “Smart High Schooler” in some ways, but they still struggle at basic tasks like tic-tac-toe. There was hope that the original multimodal training would help, but this was not the case. pic.twitter.com/1iDq0DCL4Q
– Noam Brown (@polynoamial) June 20, 2024

Likewise, technology journalist Timothy B. Lee, better known as @binarybits on X, noted that it “still makes stupid mistakes sometimes,” posting a screenshot in which he asked it a simple math word problem: Which is better, 100 pennies or three quarters? It answered three quarters.

However, even with these minor issues, Claude 3.5 Sonnet appears to represent a huge leap for Anthropic and for LLMs in general, and shows that performance gains from individual AI model makers are certainly not slowing down at current levels of available computing resources (i.e., GPUs).

By Mike Lewis
