- OpenTools' Newsletter
- Posts
- š„³Copilot Now Sees, Speaks, Thinks!
š„³Copilot Now Sees, Speaks, Thinks!
PLUS: OpenAIās DevDay Highlights
Reading time: 5 minutes
Key Points
Copilot can now analyze whatās on your screen and answer your questions, all while respecting your privacy.
With new reasoning and voice capabilities, Copilot can tackle complex problems and talk back in a more natural way.
š¤News - Microsoft is rolling out new Copilot features on Windows for everyone, including a tool that can understand and answer questions about whatās on your screen. They're also updating the Copilot apps for iOS, Android, Windows, and the web, giving them a warmer, more distinct style.
Additionally, Copilot is coming to WhatsApp, so you'll be able to chat with it directly, just like other bots you use on Metaās messaging platform.
āØWhat's new? The first new feature is called Copilot Vision. With this, Copilot can actually see what youāre looking at on your PC, especially when you're using Microsoft Edge. Itās part of Copilot Labs, which is an experimental program for Copilot Pro users. Copilot Vision can look at text and images on websites and answer questions like, āWhatās the recipe for the food in this picture?ā To use it, just type ā@copilotā in the Edge address bar. Microsoft says Copilot Vision is more powerful and privacy-focused than other screen analysis features, with data being deleted immediately after conversations. In this preview version, processed data like audio, images, or text isn't stored or used for model training. Just a heads up, though: it can only work on certain types of websites and wonāt function on paywalled or sensitive content, instead focusing on a pre-approved list of popular sites.
Next up, there's a feature called Think Deeper, which aims to make Copilot more versatile. It helps Copilot handle more complicated problems by using reasoning models that take a little longer but give step-by-step answers.
Lastly, there's a new feature called Copilot Voice, which lets you talk to Copilot and hear it respond with one of four synthetic voices. Itās kind of like OpenAIās Advanced Voice Mode for ChatGPTāCopilot Voice can pick up on your tone and respond accordingly, and you can even jump in while itās talking. Just keep in mind, there's a time limit on using Copilot Voice. Copilot Pro subscribers get more minutes, but the exact amount depends on demand, according to Microsoft.
š¤What's more? In addition to the new features, Copilot is getting more personalized. With a new "personalization" setting, Copilot will be able to use your past interactions and history, as well as how you use other Microsoft apps and services, to recommend ways you might want to use it.
Key Points
OpenAI announced Realtime API for real-time speech interactions and vision fine-tuning for enhancing GPT-4o with images and text.
A new model distillation feature allows developers to fine-tune smaller AI models using larger ones, enhancing performance and cost efficiency.
šØš»āš»News - At its 2024 DevDay today, OpenAI announced a bunch of new tools to encourage developers to create applications using its AI models.
š§° Here's the lowdown -
One of the new features OpenAI has rolled out is the Realtime API, which allows developers to create nearly real-time speech-to-speech experiences in their apps. It comes with six unique voices from OpenAI, which are different from the ones used in ChatGPT, and developers canāt use third-party voices to avoid copyright issues. During a briefing, Romain Huet, OpenAIās head of developer experience, showcased a demo of a trip planning app using the Realtime API. Users could have a conversation with an AI assistant about their upcoming trip to London and receive quick responses. The app could even mark restaurant locations on a map while chatting. Huet also demonstrated how the API could make phone calls to ask about ordering food for an event. While OpenAIās API canāt directly call restaurants like Google Duo, it can integrate with calling services like Twilio. Interestingly, OpenAI isnāt including automatic disclosures for calls to let people know theyāre talking to an AI, so itās up to developers to inform users.
OpenAI has also launched a new feature called vision fine-tuning in its API, allowing developers to use both images and text to enhance their GPT-4o applications. This should help improve the model's performance in understanding visual content. Thereās just one drawbackādevelopers can't upload copyrighted images, violent content, or any visuals that violate OpenAI's safety policies.
Lastly, OpenAI is rolling out a model distillation feature that lets developers use larger AI models, like o1-preview and GPT-4o, to fine-tune smaller models such as GPT-4o mini. Running these smaller models usually saves money, and this feature should help boost their performance. Along with model distillation, OpenAI is launching a beta evaluation tool, so developers can check how well their fine-tuned models are doing within the API.
šš»āāļøWhat else is happening?
Pinterest rolls out genAI tools for product imagery to advertisers // In addition, a combination of AI and automation features will help advertisers more quickly create their campaigns, with 50% less input required
Google is working on reasoning AI, chasing OpenAIās efforts // The push deepens the search giantās rivalry with OpenAI
US to award up to $100 million to boost use of AI for semiconductor materials // The goal is to reduce time needed to develop new semiconductor materials that are less resource-intensive
Anthropic hires OpenAI co-founder Durk Kingma // Kingma, who has a Ph.D. in machine learning from the University of Amsterdam, spent several years as a doctoral fellow at Google before joining OpenAIās founding team as a research scientist
š©š¼āšDiscover mind-blowing AI tools
OpenTools AI Tools Expert - Find the perfect AI Tool to solve supercharge your workflow. This GPT is connected to our database, so you can ask in depth questions on any AI tool directly in ChatGPT (free)
MidJourney Prompt Generator - A tool designed to help users quickly generate unique art styles using AI technology
SheetGPT - A Google Sheets add-on that allows users to integrate OpenAI's GPT3.5 text and image generation capabilities directly within Google Sheets
JIT - An AI-powered platform that simplifies and speeds up coding by providing tools for code generation, optimization, and collaboration
Maths.ai - An online platform that provides step-by-step solutions to any math problem using AI technology
WebWhiz - An AI-powered support agent that can be integrated into your website to provide instant, accurate responses to customer queries
Aomni - An AI-driven sales platform designed to help B2B sellers research, engage, and close more customers by automating and improving sales processes
BrieflyAI - A tool that uses artificial intelligence to provide actionable summaries, insights, and follow-up tasks from your video conference meetings
Clearmind - A personalized AI therapy platform designed to measure and elevate your emotional health
How likely is it that you would recommend the OpenTools' newsletter to a friend or colleague? |
Interested in featuring your services with us? Email us at [email protected] |