✨ChatGPT Gets GPT-4o-Powered Image Generation

PLUS: Gemini 2.5 Raises the Bar

Reading time: 5 minutes

Today we will discuss: 

We just launched the 12th edition of Workflow Wednesday for AI-minded professionals like you—delivering actionable AI workflows straight to your inbox.

This week’s topic: AI Innovation


Supercharge your workflow with Grammarly, CoWriter, and a fully automated AI-powered travel planner using Tally, Zapier & ChatGPT.

Key Points 

  • The new model improves accuracy, maintaining relationships between 15-20 objects and rendering readable text more reliably.

  • Unlike diffusion models, this system builds images step by step, improving text placement and object consistency while taking slightly longer.

🤩News - OpenAI is rolling out a new feature called “Images in ChatGPT,” allowing users to generate images directly within ChatGPT using GPT-4o. This feature is available across all subscription tiers. 

For free-tier users, limits will match those of DALL-E, though OpenAI hasn’t provided exact numbers. Previously, free users could generate up to three images daily with DALL-E 3. As for DALL-E itself, OpenAI says it will remain available through a custom GPT.

🌟What's new & improved? One of the biggest improvements in this new model is its ability to maintain accuracy when handling multiple elements in an image. AI-generated images often mix up attributes—like switching colors or shapes when asked to generate multiple objects. OpenAI says this new model can correctly handle 15 to 20 objects at once, a significant jump from the typical 5 to 8 in other systems.

Text rendering has also improved. AI image generators often struggle with readable text, producing letters that appear distorted or jumbled. OpenAI spent months refining this capability, and while small text can still be tricky, overall readability is much better than before.

🤓What's driving these changes? Unlike DALL-E and most other AI image generators, which use a diffusion model that refines the entire image at once over repeated denoising passes, this system builds images sequentially, left to right and top to bottom, similar to how text is written. OpenAI suggests this method may be why it performs better at text rendering and object accuracy. The tradeoff is slightly longer generation times, but OpenAI believes the quality improvements justify the wait.
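To make "left to right, top to bottom" concrete, here is a toy sketch of sequential generation. It is not OpenAI's implementation: the function name, the two-symbol palette, and the "usually copy a neighbour" rule are all hypothetical stand-ins for a learned model. The only point is that each cell is produced after, and can depend on, the cells generated before it.

```python
import random

def generate_raster(height, width, palette, seed=0):
    """Fill a grid one cell at a time, left to right, top to bottom."""
    rng = random.Random(seed)
    grid = [[None] * width for _ in range(height)]
    order = []  # records the sequence in which cells were produced
    for row in range(height):
        for col in range(width):
            # Context available so far: the cell to the left and the cell above.
            left = grid[row][col - 1] if col > 0 else None
            above = grid[row - 1][col] if row > 0 else None
            context = [c for c in (left, above) if c is not None]
            # Hypothetical rule standing in for a model's prediction:
            # usually continue from a neighbour, sometimes pick a fresh value.
            if context and rng.random() < 0.8:
                value = rng.choice(context)
            else:
                value = rng.choice(palette)
            grid[row][col] = value
            order.append((row, col))
    return grid, order

grid, order = generate_raster(3, 4, ["#", "."])
for line in grid:
    print("".join(line))
```

Because later cells can look at earlier ones, a sequential generator has a natural way to keep a word or an object consistent as it goes, which fits OpenAI's suggested explanation. The cost is that cells are produced one after another, which is in line with the longer generation times mentioned above.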

🥸Safeguards & transparency - To prevent misuse, OpenAI has included safeguards to block watermark removal, deepfake creation, and CSAM requests. While these images won’t have visible AI watermarks, OpenAI is embedding metadata using the C2PA standard to mark them as AI-generated and will have internal tools to track image origins.

Key Points 

  • Google claims Gemini 2.5 outperforms OpenAI, Anthropic, and others on AI benchmarks for reasoning, coding, and problem-solving.

  • Google says Gemini 2.5’s step-by-step reasoning improves accuracy, making it better at handling complex, real-world tasks.

👨‍💻News - Google has introduced Gemini 2.5, an upgraded AI model designed to deliver better reasoning, coding, and problem-solving.

Gemini 2.5 Pro is now available in Google AI Studio and for Gemini Advanced subscribers, who can select it from the app’s model dropdown menu. The company claims its latest model outperforms rivals from OpenAI, Anthropic, xAI, and DeepSeek on key benchmarks, measuring everything from language understanding to mathematics.

🕵️‍♂️What's more? One of Gemini’s biggest strengths, according to Google, is its native multimodality: the ability to work with not just text but also images, video, audio, and code. DeepMind CEO Demis Hassabis called Gemini 2.5 Pro “an awesome state-of-the-art model,” highlighting that it now ranks first on LMArena’s leaderboard with a significant +39 Elo point lead.
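For a rough sense of what a 39-point lead means, the sketch below applies the standard Elo expected-score formula with the usual 400-point scale. This is an assumption for illustration: LMArena's exact scoring method may differ in detail, and the function name is ours.

```python
def expected_win_rate(elo_gap):
    """Standard Elo expected score for the higher-rated side."""
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400.0))

lead = 39  # the lead quoted above
print(f"{expected_win_rate(lead):.3f}")  # prints 0.556
```

Under that assumption, a 39-point lead works out to the top model being preferred in roughly 56% of head-to-head comparisons against the runner-up.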

The company has also announced that a 2 million-token context window is “coming soon,” allowing the model to process much larger amounts of information at once.

Google says the model’s improved performance comes from a shift toward “reasoning” AI. “We’re building these thinking capabilities directly into all of our models,” the company wrote, explaining that this will lead to more capable and context-aware AI agents.

Here's a demo that highlights this in action, showing Gemini 2.5 Pro generating a fully programmed video game from a single prompt.

🙆🏻‍♀️What else is happening?

👩🏼‍🚒Discover mind-blowing AI tools

  1. Learn How to Use AI - Starting January 8, 2025, we’re launching Workflow Wednesday, a series where we teach you how to use AI effectively. Lock in early bird pricing now and secure your spot. Check it out here

  2. OpenTools AI Tools Expert - Find the perfect AI tool to supercharge your workflow. This GPT is connected to our database, so you can ask in-depth questions about any AI tool directly in ChatGPT (free)

  3. Amplitude - A comprehensive suite of tools designed to provide fast and easy access to customer insights at every step of their journey

  4. 15 Minutesplan - An AI-powered business plan generator designed for entrepreneurs and business owners

  5. Mage.space - An innovative online platform offering a wide array of AI-generated art styles and models for users seeking unique and customized visuals

  6. Lalal.ai - A tool designed to refine audio recordings by removing background music, vocal plosives, mic rumble, and other unwanted noises

  7. AgentGPT - An AI-powered tool that lets users deploy autonomous agents capable of completing a wide range of tasks from drafting emails to planning trips

  8. Vidnoz Headshot Generator - Allows users to create highly realistic AI-generated headshots within minutes

  9. BannerGPT - A tool that reads and comprehends your blog posts to generate compelling and relevant banner images

  10. TextCraft AI - An email management tool designed to improve productivity and streamline email communication

Interested in featuring your services with us? Email us at [email protected]