😱Musk Faked Grok 3’s Brilliance?

PLUS: Google, Salesforce Stir the AI Pot

Reading time: 5 minutes

Today we will discuss:

Comp AI - The Open Source Vanta/Drata Alternative - Get SOC 2, ISO 27001 and GDPR Compliant

Comp AI is developing the first open and transparent compliance automation platform, to help 100,000 companies get compliant by 2032. Their approach is helping companies automate evidence collection, manage and track risks, and have full vendor oversight - without high upfront fees and lengthy sales calls.

Enterprise-grade compliance. Simplify compliance from day one - no drawn-out demos, no six-figure commitments.

Get immediate visibility into your compliance status.


Key Points 

  • An OpenAI employee accused xAI of sharing misleading data on Grok 3’s AIME 2025 benchmark performance.

  • OpenAI claims xAI omitted key scoring metrics that could misrepresent how Grok 3 compares to their models.

☕News - An OpenAI employee has accused Elon Musk’s AI company, xAI, of sharing misleading benchmark results for its latest model, Grok 3. The issue centers on claims that xAI’s data paints an inaccurate picture of Grok 3’s capabilities. In response, xAI co-founder Igor Babushkin stood by the results, insisting that the company’s reporting was fair.

🤨What's the truth? In a blog post, xAI shared a graph highlighting Grok 3’s performance on AIME 2025, a set of challenging math questions from a recent invitational exam. While some experts question whether AIME is the best measure of AI performance, it is still widely used to test math abilities.

According to xAI’s graph, Grok 3 Reasoning Beta and Grok 3 mini Reasoning outperformed OpenAI’s top model, o3-mini-high. However, OpenAI employees quickly pointed out that the chart left out a key detail: o3-mini-high’s score at “cons@64.” This setting gives the model 64 tries per question and takes the most common answer as its final one, which tends to boost scores. Leaving that score out makes the comparison misleading. In fact, when looking at first-attempt scores, both Grok 3 variants fall behind OpenAI’s models, yet xAI continues to promote Grok 3 as the “world’s smartest AI.”
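To make the difference concrete, here is a minimal sketch of how a consensus-style score can diverge from a first-attempt score. It is an illustration only, not the evaluation code used by xAI or OpenAI: the function names and the sample answers are hypothetical, answers are assumed to be compared as exact strings, and four samples stand in for 64.

```python
from collections import Counter


def consensus_answer(samples):
    """Return the most common answer among the sampled attempts."""
    return Counter(samples).most_common(1)[0][0]


def score(problems, k):
    """Fraction of problems solved when the first k attempts are majority-voted.

    With k=1 this is simply the first-attempt score.
    """
    correct = sum(
        1 for samples, truth in problems
        if consensus_answer(samples[:k]) == truth
    )
    return correct / len(problems)


# Hypothetical data: (sampled answers in order, correct answer).
problems = [
    (["12", "15", "15", "15"], "15"),  # first attempt wrong, majority right
    (["7", "7", "9", "7"], "7"),       # right either way
    (["20", "21", "21", "21"], "30"),  # wrong either way
]

first_attempt = score(problems, 1)  # 1 of 3 problems solved
consensus = score(problems, 4)      # 2 of 3 problems solved

print(f"first attempt: {first_attempt:.2f}, consensus of 4: {consensus:.2f}")
```

On this toy data the same model looks twice as strong under majority voting, which is why a chart that mixes the two settings can be hard to read fairly.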

šŸ§What's the conclusion? Babushkin pushed back, arguing OpenAI has shared similarly misleading comparisons before. But as AI researcher Nathan Lambert pointed out, what remains unclear is the computational cost behind these results, a factor that could reveal far more about what these models can actually do.


Key Points 

  • Salesforce will invest at least $2.5 billion over seven years to use Google’s cloud services.

  • The partnership allows Salesforce customers to run AI-powered tools like Agentforce on Google Cloud.

  • The deal strengthens competition against Microsoft by integrating Google’s AI with Salesforce’s market-leading software.

šŸ¤News - Salesforce has signed a multibillion-dollar cloud deal with Google in a strategic move to challenge Microsoftā€™s dominance in AI and productivity tools. 

The partnership is designed to attract corporate customers who currently rely on Microsoft’s services, combining Salesforce’s strengths in customer management with Google’s cloud and AI capabilities.

šŸ‘Øā€šŸ’»What happens next? While Salesforce has mainly depended on Amazonā€™s cloud services, this new agreement commits the company to spend at least $2.5 billion over the next seven years on Google Cloud. This shift will give Salesforce customers the option to run key products like their customer-management software, Agentforce AI assistants, and Data Cloudā€”on Googleā€™s platform. The goal is to offer businesses more choice and deeper integration with Googleā€™s AI tools.

🥸Why this matters - This deal reflects a growing trend among tech giants to form partnerships that expand their AI offerings. While Microsoft still holds an early lead in corporate AI adoption, uptake has been slower than expected, creating an opportunity for competitors. With Salesforce’s market-leading software now tied more closely to Google’s cloud services, the alliance strengthens both companies’ positions. According to Google Cloud CEO Thomas Kurian, the collaboration will allow users to draft documents in Google Workspace, enrich them with Salesforce data, and refine proposals using Google’s Gemini AI model, bringing AI-driven efficiency to business workflows.

šŸ™†šŸ»ā€ā™€ļøWhat else is happening?

In this week’s Workflow Wednesday, we provided AI workflows for the tasks our readers needed help with. Have a question you want answered? Join here

This week’s theme was AI Optimization, where we covered the topics:

šŸ‘©šŸ¼ā€šŸš’Discover mind-blowing AI tools

  1. Learn How to Use AI - Starting January 8, 2025, we’re launching Workflow Wednesday, a series where we teach you how to use AI effectively. Lock in early bird pricing now and secure your spot. Check it out here

  2. OpenTools AI Tools Expert - Find the perfect AI tool to supercharge your workflow. This GPT is connected to our database, so you can ask in-depth questions on any AI tool directly in ChatGPT (free)

  3. Datumo - A CRM enhancement tool that ensures your CRM data is reliable, accurate, and valuable

  4. Docalysis - An AI-powered platform designed to answer questions about various document types, including PDFs, TXTs, CSVs, and DOCX files

  5. Opera Browser - A popular web browser that offers users an enhanced browsing experience with features such as a built-in ad blocker and free VPN

  6. Sonix - An advanced AI-powered platform offering transcription, translation, and analysis tools for audio and video content

  7. RhetorAI - A feedback collection tool designed to streamline the process of garnering customer insights

  8. Extrapolate - An innovative AI-powered aging app that allows users to see how they would look in the future


Interested in featuring your services with us? Email us at [email protected]