Claude 3.5 Beats Galileo in AI Hallucination Test: What It Means

The AI company Galileo has announced its latest Hallucination Index, a framework that evaluates 22 leading generative AI models.

Models are tested using a metric called context adherence, which measures “closed-domain hallucinations: cases where your model said things that were not provided in the context.”
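To make the idea of a closed-domain hallucination concrete, here is a minimal sketch of a toy check. It is not Galileo's context adherence metric, whose implementation is not described here. The function name `unsupported_sentences`, the word-overlap heuristic, and the 0.5 threshold are all assumptions chosen for illustration: a response sentence is flagged when most of its words never appear in the supplied context.

```python
import re


def unsupported_sentences(context: str, response: str, threshold: float = 0.5) -> list[str]:
    """Return response sentences whose words are mostly absent from the context.

    Toy word-overlap heuristic (an assumption for illustration only),
    not the metric used in the Hallucination Index.
    """
    context_words = set(re.findall(r"[a-z0-9]+", context.lower()))
    flagged = []
    # Split the response into sentences on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        words = re.findall(r"[a-z0-9]+", sentence.lower())
        if not words:
            continue
        # Fraction of this sentence's words that also occur in the context.
        overlap = sum(w in context_words for w in words) / len(words)
        if overlap < threshold:
            flagged.append(sentence)
    return flagged


context = "The index evaluates 22 models using a context adherence metric."
response = "The index evaluates 22 models. Every model was trained on Mars."
flagged = unsupported_sentences(context, response)
print(flagged)
```

In this example the first sentence is fully supported by the context, while the second introduces a claim the context never made, so only the second is flagged. A production metric would need to handle paraphrase and inference, which simple word overlap cannot.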

The best-performing model overall for RAG, according to the ranking, is Claude 3.5 Sonnet from Anthropic. Galileo said that this model and Anthropic’s other model, Claude 3 Opus, had near-perfect scores, beating out OpenAI’s models, which won last year.

From a cost perspective, the best-performing model was Google’s Gemini 1.5 Flash. Alibaba’s Qwen2-72B-Instruct was the best-performing open-source model overall, though in short-context RAG tests, Meta’s Llama-3-70b-instruct was the best.

Broken down by context length, the best closed-source model in short-context RAG was Claude 3.5 Sonnet; in medium-context RAG it was Google’s Gemini-1.5-flash-001 (with cost as the tiebreaker against other models that also earned a perfect score); and in long-context RAG it was again Claude 3.5 Sonnet.
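The cost tiebreaker described above can be sketched as a two-key comparison: rank by score first, then prefer the cheaper model among those tied. The model names, scores, and prices below are made-up placeholders, not figures from the Index, and the `Result` class and `best_model` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    model: str
    score: float          # context adherence score, higher is better
    cost_per_mtok: float  # placeholder price per million tokens


def best_model(results: list[Result]) -> Result:
    # Highest score wins; among equal scores, the cheaper model wins.
    return min(results, key=lambda r: (-r.score, r.cost_per_mtok))


# Placeholder data: two models tie on a perfect score, one is cheaper.
results = [
    Result("model-a", 1.00, 3.00),
    Result("model-b", 1.00, 0.35),
    Result("model-c", 0.97, 0.10),
]
winner = best_model(results)
print(winner.model)
```

Here `model-a` and `model-b` tie on score, so the lower price decides in favor of `model-b`; `model-c` is cheapest but loses on score before cost is ever considered.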

“In today’s rapidly evolving AI landscape, developers and enterprises face a critical challenge: how to harness the power of generative AI while balancing cost, accuracy, and reliability. Current benchmarks are often based on academic use-cases, rather than real-world applications. Our new Index seeks to address this by testing models in real-world use cases that require the LLMs to retrieve data, a common practice in enterprise AI implementations,” says Vikram Chatterji, CEO and co-founder of Galileo. “As hallucinations continue to be a major hurdle, our goal wasn’t to just rank models, but rather give AI teams and leaders the real-world data they need to adopt the right model, for the right task, at the right price.”


