📻Stability’s New Audio AI

PLUS: Meta FAIR’s Latest Breakthroughs

Reading time: 5 minutes

Today we will discuss:

Meet Ninja, Your Agentic AI Assistant - MyNinja.AI

Crush Your To-Do List With 15+ AI Agents For Any Task

Ninja’s AI Assistant combines the best AI models and custom agents into one powerful app, giving you access to industry-leading AI tools under a single subscription, with unlimited tasks.

  • AI agents for research, writing, coding, file analysis, image generation, and more

  • Access 25+ leading AI models from OpenAI, Anthropic, Google, Meta and others

  • APIs for builders, including our powerful Ninja Compound AI LLMs

  • Apps for every device including web, mobile, and desktop

  • Industry-leading prices thanks to our partnership with AWS

Perfect for Businesses, Game-Changing for Individuals

Whether you’re running a business or looking to accomplish more in your personal life, Ninja can help you save time and money with AI agents for any task.

Try Ninja For Free - Unlimited Plans Starting at $5/mo

Key Points 

  • Stability’s new model can generate short stereo audio clips in under 8 seconds directly on mobile devices.

  • It was trained only on royalty-free audio, which may help avoid the copyright issues faced by other AI music tools.

🔊News - Stability AI has released a new model called Stable Audio Open Small, built to generate short audio clips directly on your phone. It works without an internet connection and was developed with chipmaker Arm to be lightweight and fast. The model can create around 11 seconds of stereo audio in under 8 seconds, all on-device.

🤔How’s it different from the rest? One thing that sets this model apart is how it was trained. Unlike some AI music tools that have faced backlash for using copyrighted songs, this one was trained only on royalty-free clips from Free Music Archive and Freesound. That could make it a safer choice for developers looking to avoid legal headaches.

🧐What it can and can’t do - Notably, this isn’t meant to generate full-length tracks or realistic vocals. Instead, it’s designed for short instrumental pieces like drum fills, transitions, or sound effects. Since the training data leans heavily toward Western music, results may vary by genre. It also only understands prompts written in English for now.

🥸Who can use it - The model is open for use by hobbyists, researchers, and small companies. But if your company makes more than $1 million a year, you’ll need a commercial license.

We’ve just launched the 19th edition of Workflow Wednesday for AI-minded professionals like you—actionable AI workflows delivered straight to your inbox.

This week’s topic: AI Security & Privacy

Key Points 

  • Meta’s Perception Encoder improves AI vision and spatial understanding, excelling in images, videos, and language tasks.

  • The Locate 3D model enables robots to find objects in 3D using natural language and a large new dataset.

  • The Collaborative Reasoner framework enhances AI social skills like empathy and teamwork, improving multi-agent reasoning by nearly 30%.

👨‍💻News - Meta’s Fundamental AI Research team recently unveiled five new projects that bring machines closer to human-level perception and intelligence. The focus is on improving how AI understands and interacts with the world, targeting applications like robotics, language models, and collaborative agents.

🤖Here’s the lowdown - The centerpiece is the Perception Encoder, a large-scale vision system that handles images and videos with remarkable detail. It can spot subtle things like a tiny bird or a stingray hidden under the sea floor. When paired with large language models, it also improves AI’s ability to answer visual questions, describe images, and understand spatial relationships.

Meta also released the Perception Language Model (PLM), which supports open research with versions ranging up to 8 billion parameters. To back it up, FAIR created the largest dataset of human-labeled video samples to date and introduced a benchmark called PLM-VideoBench. This helps test AI on complex video understanding tasks.

Locate 3D allows robots to find objects in 3D spaces based on natural language commands. The system uses a large new dataset, doubling existing annotated data, which could improve how robots interact with humans in everyday environments.

Furthermore, Meta’s Dynamic Byte Latent Transformer offers more efficient and robust language modeling by working at the byte level. Meanwhile, the Collaborative Reasoner framework enhances AI’s social skills such as empathy and teamwork, showing nearly 30 percent improvement on multi-agent reasoning tasks.
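The key idea behind byte-level modeling is that the model consumes raw UTF-8 bytes instead of a fixed tokenizer vocabulary, so no word or subword is ever "out of vocabulary." As a toy illustration (this is not Meta's code, just the input representation the approach starts from):

```python
# Toy illustration of byte-level input units (not Meta's implementation).
# A byte-level model sees raw UTF-8 byte values (0-255), so it needs no
# tokenizer vocabulary and can never hit an unknown token.

def to_byte_tokens(text: str) -> list[int]:
    """Encode text as a sequence of UTF-8 byte values."""
    return list(text.encode("utf-8"))

# Accented characters span multiple bytes, so "naïve" (5 characters)
# becomes 6 byte tokens:
print(to_byte_tokens("naïve"))  # [110, 97, 195, 175, 118, 101]
```

The trade-off is longer sequences (one token per byte rather than per word piece), which is why efficiency work like the Dynamic Byte Latent Transformer matters for making this practical.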

Together, these releases highlight Meta’s strong push toward building AI that can perceive, understand, and collaborate like humans.

🙆🏻‍♀️What else is happening?

👩🏼‍🚒Discover mind-blowing AI tools

  1. Learn How to Use AI - Starting January 8, 2025, we’re launching Workflow Wednesday, a series where we teach you how to use AI effectively. Lock in early bird pricing now and secure your spot. Check it out here

  2. OpenTools AI Tools Expert - Find the perfect AI tool to supercharge your workflow. This GPT is connected to our database, so you can ask in-depth questions on any AI tool directly in ChatGPT (free)

  3. Voicemod - AI-powered voice-changing software that can be used to modify voices in real-time

  4. BlueWillow - An AI tool designed to help users create logos, graphics, and photo-realistic scenes

  5. Spot a bot - A tool that focuses on detecting bots on Twitter

  6. Podfy AI - A tool that enhances the podcasting journey by simplifying transcriptions, show notes, timestamps, and more

  7. LongShot AI - An AI-powered long-form content assistant that helps users research, generate, and optimize content

  8. Tavily - An AI-powered research platform that helps users conduct comprehensive and accurate research

How likely is it that you would recommend the OpenTools newsletter to a friend or colleague?


Interested in featuring your services with us? Email us at [email protected]