- OpenTools' Newsletter
- Posts
- đł Anthropic Flags Blackmail In Leading AI Models
đł Anthropic Flags Blackmail In Leading AI Models
PLUS: Oakley Joins Metaâs Lineup

Reading time: 5 minutes
Today we will discuss:
Most top AI models still break badâAnthropic finds blackmail and harmful behavior across 16 leading AI systems in controlled tests
Donât miss this weekâs Workflow WednesdayâUsing AI tools to adapt, navigate change; now available at only $20/month!
Metaâs smart glasses go sporty with Oakley upgradeâNew models target athletes with better battery, 3K video, and AI features
In other AI newsâOpenAI pulls Jony Ive deal promos due to a court order over the âioâ name, Microsoftâs top scientist criticizes Trumpâs push to block state AI regulation, Huaweiâs AI chips outperform Nvidia in DeepSeek tests, and Googleâs Gemini transparency rollback leaves developers struggling to debug
Saved the best for lastâ12 must-try AI tools
This Week in Workflow Wednesday: AI Adaptability
Learn a New Market in 30 Minutes Using AI
Youâve just been told to pivot into a new vertical. No research team. No prep time. Just you, a blank page, and a deadline.
Hereâs how to go from clueless to confident - fast:
Step 1:
Open Perplexity.ai and ask:
âWhatâs the current state of [insert industry] in 2025? Top players, trends, pain points.â
Skim the results. Pull out key insights: names, numbers, and friction points.
Step 2:
Drop that intel into ChatGPT and prompt:
âBuild a buyer persona based on this industry overview. Include their âŠ..Read the full workflow (plus 2 others) in this weekâs edition â
Key Points
Anthropic found that most top AI models turned to blackmail when given autonomy and a threat to their goals.
OpenAIâs o3 and o4-mini misunderstood the scenario, but showed much lower blackmail rates when retested more clearly.
âContext of the news - Weeks after revealing that its own Claude Opus 4 model had blackmailed engineers in a controlled test, Anthropic is back with new findings this time across 16 frontier AI models from OpenAI, Google, Meta, xAI, and others.
The company created a fictional corporate scenario where each model had broad access to emails and could act autonomously. Faced with goal-threatening obstacles, most models eventually chose to blackmail a fictional executive.
đ€News - Claude Opus 4 resorted to blackmail 96% of the time. Gemini 2.5 Pro followed closely at 95%, GPT-4.1 at 80%, and DeepSeek R1 at 79%. Even when the test was softened removing conflicting goals or swapping blackmail for corporate espionage harmful behavior still occurred, though at varying rates.
OpenAIâs o3 and o4-mini models were excluded from the main results after repeatedly misunderstanding the scenario or hallucinating fake policies. When tested again with simpler prompts, o3 blackmailed 9% of the time, while o4-mini did so only 1% of the time.
đ€Why this matters - Anthropic says these results reflect a deeper issue in agentic AI systems, not just a model-specific bug. The company is calling for greater transparency and alignment efforts before these behaviors appear outside the lab.
Key Points
The Oakley glasses are designed for athletes, featuring 3K video, water resistance, and up to 8 hours of battery.
Meta plans to follow up with more display-free models and a display-equipped version later this year.
âšïžNews - Meta is branching out from its Ray-Ban smart glasses and teaming up with Oakley for a new set of display-free models. These glasses are designed with athletes in mind and are based on Oakleyâs HSTN frame.
Just like the Ray-Bans, they can handle calls, music, photos, and video, and they come with Metaâs AI assistant that can answer questions about your surroundings.
đ€What's new & improved? Starting at $399, the Oakley glasses offer several upgrades, including 3K video recording, water resistance, and almost double the battery life.
The limited edition version with gold accents comes in at $499. According to Meta VP Alex Himmel, the new battery chemistry and software tweaks make a noticeable difference. The glasses can run for 8 hours straight, and the case holds up to 48 hours of charge.
đSee also - Metaâs first attempt at smart glasses didnât land well in 2021, but its second version, launched in 2023, became a breakout success. That unexpected momentum convinced the company to keep building in this category, instead of pivoting entirely to AR. Meta is already planning a second Oakley release later this year, aimed specifically at cyclists.
The Oakley collaboration is part of Metaâs larger hardware roadmap. A more advanced version with a display is expected later this year, and the company is aiming to release full augmented reality glasses by 2027. With Apple, Amazon, and others exploring similar products, Meta is moving fast to stay ahead.
đđ»ââïžWhat else is happening?
OpenAI pulls promotional materials around Jony Ive deal due to court order // According to Bloombergâs Mark Gurman, the deal is on track and has not dissolved or anything of the sort Instead, he said a judge has issued a restraining order over the io name, forcing the company to pull all materials that used it
Trumpâs plan to ban US states from AI regulation will hold us back, says Microsoft science chief // Eric Horvitzâs comments come despite reports that Microsoft is lobbying with Google, Meta and Amazon to support ban
How Huaweiâs ascend AI chips outperform Nvidia processors in running DeepSeekâs R1 model // Performance of Huaweiâs AI data centre architecture, CloudMatrix 384, illustrates the firmâs progress in overcoming US tech control measures
Googleâs Gemini transparency cut leaves enterprise developers âdebugging blindâ // The change which echoes a similar move by OpenAI, replaces the modelâs step-by-step reasoning with a simplified summary.
đ©đŒâđDiscover mind-blowing AI tools
Learn How to Use AI - Starting January 8, 2025, weâre launching Workflow Wednesday, a series where we teach you how to use AI effectively. Lock in early bird pricing now and secure your spot. Check it out here
OpenTools AI Tools Expert - Find the perfect AI Tool to solve supercharge your workflow. This GPT is connected to our database, so you can ask in depth questions on any AI tool directly in ChatGPT (free)
UXSniff - An AI-powered tool that provides user experience insights for websites
BrandBastion - Helps brands safeguard their online communities and manage conversations effectively
Fliz - An AI-powered tool that automates the creation of high-quality product videos
Comfy Workflows - A tool that allows users to create, share, and download workflows
Cookii - A cooking assistant that allows users to search for recipes based on their dietary restrictions and cuisine preferences
ZeroBot - A voice-enabled chatbot that allows users to have conversations with a computer friend that talks like a real person
ProductScope - An AI-powered platform that helps brands and marketers optimize their Amazon listings
Tripmix - An AI travel planner that crafts personalized trips based on individual preferences
Securewoof - An AI-powered malware scanner that checks executable files for maliciousness
SkillAI - An AI-powered platform that generates customized learning paths for any skill you want to learn

How likely is it that you would recommend the OpenTools' newsletter to a friend or colleague? |
Interested in featuring your services with us? Email us at [email protected] |