šŸ„³Copilot Now Sees, Speaks, Thinks!

PLUS: OpenAIā€™s DevDay Highlights

Reading time: 5 minutes

Key Points 

  • Copilot can now analyze whatā€™s on your screen and answer your questions, all while respecting your privacy.

  • With new reasoning and voice capabilities, Copilot can tackle complex problems and talk back in a more natural way.

šŸ¤“News - Microsoft is rolling out new Copilot features on Windows for everyone, including a tool that can understand and answer questions about whatā€™s on your screen. They're also updating the Copilot apps for iOS, Android, Windows, and the web, giving them a warmer, more distinct style. 

Additionally, Copilot is coming to WhatsApp, so you'll be able to chat with it directly, just like other bots you use on Metaā€™s messaging platform.

āœØWhat's new? The first new feature is called Copilot Vision. With this, Copilot can actually see what youā€™re looking at on your PC, especially when you're using Microsoft Edge. Itā€™s part of Copilot Labs, which is an experimental program for Copilot Pro users. Copilot Vision can look at text and images on websites and answer questions like, ā€œWhatā€™s the recipe for the food in this picture?ā€ To use it, just type ā€œ@copilotā€ in the Edge address bar. Microsoft says Copilot Vision is more powerful and privacy-focused than other screen analysis features, with data being deleted immediately after conversations. In this preview version, processed data like audio, images, or text isn't stored or used for model training. Just a heads up, though: it can only work on certain types of websites and wonā€™t function on paywalled or sensitive content, instead focusing on a pre-approved list of popular sites.

Next up, there's a feature called Think Deeper, which aims to make Copilot more versatile. It helps Copilot handle more complicated problems by using reasoning models that take a little longer but give step-by-step answers.

Lastly, there's a new feature called Copilot Voice, which lets you talk to Copilot and hear it respond with one of four synthetic voices. Itā€™s kind of like OpenAIā€™s Advanced Voice Mode for ChatGPTā€”Copilot Voice can pick up on your tone and respond accordingly, and you can even jump in while itā€™s talking. Just keep in mind, there's a time limit on using Copilot Voice. Copilot Pro subscribers get more minutes, but the exact amount depends on demand, according to Microsoft.

šŸ¤–What's more? In addition to the new features, Copilot is getting more personalized. With a new "personalization" setting, Copilot will be able to use your past interactions and history, as well as how you use other Microsoft apps and services, to recommend ways you might want to use it.

Key Points 

  • OpenAI announced Realtime API for real-time speech interactions and vision fine-tuning for enhancing GPT-4o with images and text.

  • A new model distillation feature allows developers to fine-tune smaller AI models using larger ones, enhancing performance and cost efficiency.

šŸ‘ØšŸ»ā€šŸ’»News - At its 2024 DevDay today, OpenAI announced a bunch of new tools to encourage developers to create applications using its AI models.

šŸ§° Here's the lowdown - 

One of the new features OpenAI has rolled out is the Realtime API, which allows developers to create nearly real-time speech-to-speech experiences in their apps. It comes with six unique voices from OpenAI, which are different from the ones used in ChatGPT, and developers canā€™t use third-party voices to avoid copyright issues. During a briefing, Romain Huet, OpenAIā€™s head of developer experience, showcased a demo of a trip planning app using the Realtime API. Users could have a conversation with an AI assistant about their upcoming trip to London and receive quick responses. The app could even mark restaurant locations on a map while chatting. Huet also demonstrated how the API could make phone calls to ask about ordering food for an event. While OpenAIā€™s API canā€™t directly call restaurants like Google Duo, it can integrate with calling services like Twilio. Interestingly, OpenAI isnā€™t including automatic disclosures for calls to let people know theyā€™re talking to an AI, so itā€™s up to developers to inform users.

OpenAI has also launched a new feature called vision fine-tuning in its API, allowing developers to use both images and text to enhance their GPT-4o applications. This should help improve the model's performance in understanding visual content. Thereā€™s just one drawbackā€”developers can't upload copyrighted images, violent content, or any visuals that violate OpenAI's safety policies.

Lastly, OpenAI is rolling out a model distillation feature that lets developers use larger AI models, like o1-preview and GPT-4o, to fine-tune smaller models such as GPT-4o mini. Running these smaller models usually saves money, and this feature should help boost their performance. Along with model distillation, OpenAI is launching a beta evaluation tool, so developers can check how well their fine-tuned models are doing within the API.

šŸ™†šŸ»ā€ā™€ļøWhat else is happening?

šŸ‘©šŸ¼ā€šŸš’Discover mind-blowing AI tools

  1. OpenTools AI Tools Expert  - Find the perfect AI Tool to solve supercharge your workflow. This GPT is connected to our database, so you can ask in depth questions on any AI tool directly in ChatGPT (free)

  2. MidJourney Prompt Generator - A tool designed to help users quickly generate unique art styles using AI technology

  3. SheetGPT - A Google Sheets add-on that allows users to integrate OpenAI's GPT3.5 text and image generation capabilities directly within Google Sheets

  4. JIT - An AI-powered platform that simplifies and speeds up coding by providing tools for code generation, optimization, and collaboration

  5. Maths.ai - An online platform that provides step-by-step solutions to any math problem using AI technology

  6. WebWhiz - An AI-powered support agent that can be integrated into your website to provide instant, accurate responses to customer queries

  7. Aomni - An AI-driven sales platform designed to help B2B sellers research, engage, and close more customers by automating and improving sales processes

  8. BrieflyAI - A tool that uses artificial intelligence to provide actionable summaries, insights, and follow-up tasks from your video conference meetings

  9. Clearmind - A personalized AI therapy platform designed to measure and elevate your emotional health

How likely is it that you would recommend the OpenTools' newsletter to a friend or colleague?

Login or Subscribe to participate in polls.

Interested in featuring your services with us? Email us at [email protected]