📻Stability’s New Audio AI
PLUS: Meta FAIR’s Latest Breakthroughs

Reading time: 5 minutes
Today we will discuss:
Sponsored: NinjaTech AI—An AI assistant combining 15+ smart agents to help you tackle any task effortlessly
Stability AI’s local audio model—Stability’s compact model works offline on phones and is trained only on royalty-free sound clips
Don’t miss this week’s Workflow Wednesday—AI security and privacy with PrivacyPolicies.com, Writer.com, and OpenAI + Gmail + Make for safer automation; now available at only $20/month!
Five breakthroughs from Meta’s FAIR—New models improve AI vision, language, robotics, and social skills
In other AI news today—OpenAI upgrades ChatGPT with GPT-4.1 models, YouTube launches Gemini AI’s ‘Peak Points’ to target ads during high viewer engagement, Microsoft tests hands-free ‘Hey, Copilot!’ voice activation in Windows 11, DeepMind unveils a new AI tool excelling at math and science problems
Saved the best for last—8 must-try AI tools
Meet Ninja, Your Agentic AI Assistant - MyNinja.AI
Crush Your To-Do List With 15+ AI Agents For Any Task
Ninja’s AI Assistant combines the best AI models and custom agents into one powerful app, giving you access to the industry's best AI tools under one subscription, with unlimited tasks.
AI agents for research, writing, coding, file analysis, image generation, and more
Access 25+ leading AI models from OpenAI, Anthropic, Google, Meta and others
APIs for builders, including our powerful Ninja Compound AI LLMs
Apps for every device including web, mobile, and desktop
Industry-leading prices thanks to our partnership with AWS
Perfect for Businesses, Game-Changing for Individuals
Whether you’re running a business or looking to accomplish more in your personal life, Ninja can help you save time and money with AI agents for any task.
Try Ninja For Free - Unlimited Plans Starting at $5/mo
Key Points
Stability’s new model can generate short stereo audio clips in under 8 seconds directly on mobile devices.
It was trained only on royalty-free audio, which may help avoid the copyright issues faced by other AI music tools.
🔊News - Stability AI has released a new model called Stable Audio Open Small, built to generate short audio clips directly on your phone. It works without an internet connection and was developed with chipmaker Arm to be lightweight and fast. The model can create around 11 seconds of stereo audio in under 8 seconds, all on-device.
🤔How’s it different from the rest? One thing that sets this model apart is how it was trained. Unlike some AI music tools that have faced backlash for using copyrighted songs, this one was trained only on royalty-free clips from Free Music Archive and Freesound. That could make it a safer choice for developers looking to avoid legal headaches.
🧐What it can and can’t do - Notably, this isn’t meant to generate full-length tracks or realistic vocals. Instead, it’s designed for short instrumental pieces like drum fills, transitions, or sound effects. Since the training data leans heavily Western, the results may vary depending on the genre. It also only understands prompts written in English for now.
🥸See also - The model is open for use by hobbyists, researchers, and small companies. But if your company makes more than $1 million a year, you’ll need a commercial license.
We’ve just launched the 19th edition of Workflow Wednesday for AI-minded professionals like you—actionable AI workflows delivered straight to your inbox.
This week’s topic: AI Security & Privacy
Key Points
Meta’s Perception Encoder improves AI vision and spatial understanding, excelling in images, videos, and language tasks.
The Locate 3D model enables robots to find objects in 3D using natural language and a large new dataset.
Collaborative Reasoner framework enhances AI social skills like empathy and teamwork, improving multi-agent reasoning by nearly 30%.
👨‍💻News - Meta’s Fundamental AI Research team recently unveiled five new projects that bring machines closer to human-level perception and intelligence. The focus is on improving how AI understands and interacts with the world, targeting applications like robotics, language models, and collaborative agents.
🤖Here’s the lowdown - The centerpiece is the Perception Encoder, a large-scale vision system that handles images and videos with remarkable detail. It can spot subtle things like a tiny bird or a stingray hidden under the sea floor. When paired with large language models, it also improves AI’s ability to answer visual questions, describe images, and understand spatial relationships.
Meta also released the Perception Language Model, or PLM, which supports open research with versions ranging up to 8 billion parameters. To back it up, FAIR created the largest dataset of human-labeled video samples to date and introduced a benchmark called PLM-VideoBench, which tests AI on complex video understanding tasks.
Locate 3D allows robots to find objects in 3D spaces based on natural language commands. The system uses a large new dataset, doubling existing annotated data, which could improve how robots interact with humans in everyday environments.
Furthermore, Meta’s Dynamic Byte Latent Transformer offers more efficient and robust language modeling by working at the byte level. Meanwhile, the Collaborative Reasoner framework enhances AI’s social skills such as empathy and teamwork, showing nearly 30 percent improvement on multi-agent reasoning tasks.
Together these releases highlight Meta’s strong push toward building AI that can perceive, understand, and collaborate like humans.
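For readers curious what "working at the byte level" means in practice, here is a minimal Python sketch. It is not Meta's implementation and says nothing about how the Dynamic Byte Latent Transformer processes its input internally. It only shows the general idea that text can be fed to a model as raw UTF-8 bytes (256 possible values) instead of entries from a learned token vocabulary. The function names `to_byte_ids` and `from_byte_ids` are hypothetical, chosen for illustration.

```python
def to_byte_ids(text: str) -> list[int]:
    """Encode text as UTF-8 and return one integer ID (0-255) per byte."""
    return list(text.encode("utf-8"))


def from_byte_ids(ids: list[int]) -> str:
    """Invert to_byte_ids by decoding the byte sequence back into text."""
    return bytes(ids).decode("utf-8")


if __name__ == "__main__":
    sample = "naïve café"
    ids = to_byte_ids(sample)
    # Accented characters take more than one byte, so the ID sequence
    # can be longer than the character count.
    print(len(sample), "characters ->", len(ids), "byte IDs")
    print(ids)
    print(from_byte_ids(ids))
```

Because every possible string maps onto the same 256 byte values, a byte-level model never meets an out-of-vocabulary word, which is one commonly cited reason such models are described as more robust to typos and unusual spellings.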
🙆🏻‍♀️What else is happening?
OpenAI brings its GPT-4.1 models to ChatGPT // As a result of this update, OpenAI is removing GPT-4o mini from ChatGPT for all users
YouTube announces Gemini AI feature to target ads when viewers are most engaged // The artificial intelligence feature, called "Peak Points," identifies times when videos receive elevated levels of viewer attention and packages ads to be placed after those moments
Microsoft starts testing ‘Hey, Copilot!’ in Windows // Beta testers can try hands-free access to the AI app in Windows 11
DeepMind claims its newest AI tool is a whiz at math and science problems // The new AI system will tackle problems with “machine-gradable” solutions
👩🏼‍🚒Discover mind-blowing AI tools
Learn How to Use AI - Starting January 8, 2025, we’re launching Workflow Wednesday, a series where we teach you how to use AI effectively. Lock in early bird pricing now and secure your spot. Check it out here
OpenTools AI Tools Expert - Find the perfect AI tool to supercharge your workflow. This GPT is connected to our database, so you can ask in-depth questions about any AI tool directly in ChatGPT (free)
Voicemod - AI-powered voice-changing software that can be used to modify voices in real-time
BlueWillow - An AI tool designed to help users create logos, graphics, and photo-realistic scenes
Spot a bot - A tool that focuses on detecting bots on Twitter
Podfy AI - A tool that enhances the podcasting journey by simplifying transcriptions, show notes, timestamps, and more
LongShot AI - An AI-powered long-form content assistant that helps users research, generate, and optimize content
Tavily - An AI-powered research platform that helps users conduct comprehensive and accurate research
