👀Grok Gains Vision
PLUS: Google’s New AI Agent
Reading time: 5 minutes
Key Points
xAI’s Grok AI now allows paid users on X to upload images and ask questions about them.
Elon Musk highlighted that Grok can explain jokes using the new image-understanding feature, which is still developing.
Musk hinted that Grok may soon handle documents like PDFs and promised that its functionality will advance rapidly.
🤓News - Elon Musk-owned xAI has enhanced its Grok AI model with new image-understanding capabilities, allowing paid users on the social platform X, who have access to the AI chatbot, to upload images and ask questions about them. The update was announced via the official @grok handle on Monday.
In a separate post, Musk highlighted that Grok can even interpret the meaning behind jokes using this new feature, although he noted that the functionality is still in its early stages and is expected to improve rapidly.
Additionally, Musk hinted at plans for Grok to understand documents, responding to user criticism regarding the model's inability to handle certain file formats like PDFs. “Not for long,” he asserted, claiming that xAI aims to achieve in months what others take years to accomplish.
💁🏻♂️See also - X is actively enhancing its AI chatbot and paid user offerings to attract more users. Earlier this month, the platform introduced a new tool called Radar for Premium+ subscribers, designed to monitor real-time trends and provide insights into ongoing conversations, further expanding its suite of features for users.
Key Points
Project Jarvis is an AI-driven tool that Google is reportedly developing to automate browser tasks like research, shopping, and bookings.
The AI system, optimized for Chrome, captures screenshots and interprets them to carry out user commands.
♨️News - Google is reportedly developing a new “large action model” designed as a computer-using agent that can handle browser-based tasks for users, including gathering research, making purchases, and booking flights.
Known internally as “Project Jarvis,” the tool is expected to run on an upcoming version of Google’s Gemini AI and could be previewed as soon as December. Initially, Jarvis would be available only to early testers, suggesting that a public launch is still some time away.
Notably, Jarvis is designed to work exclusively within a web browser and is specifically optimized for Chrome. This tool could automate various everyday, web-based activities by interpreting commands, taking screenshots of the user’s screen, and then executing actions such as clicking buttons or typing into fields. However, reports indicate Jarvis currently operates at a slow pace, as it “thinks” for a few seconds before each step.
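For the technically curious, the screenshot-then-act cycle described above can be pictured as a simple loop. The sketch below is only an illustration under our own assumptions, not Google's implementation: `FakeBrowser`, `choose_action`, and `run_agent` are hypothetical names, the "screenshot" is a plain dictionary rather than pixels, and the decision step is a hard-coded rule where a real agent would call a model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # hypothetical element name, e.g. "search"
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakeBrowser:
    """Stand-in for a real browser: stores field values and a click log."""
    fields: dict = field(default_factory=dict)
    clicks: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would capture pixels; here the "screenshot" is page state.
        return {"fields": dict(self.fields), "clicks": list(self.clicks)}

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click":
            self.clicks.append(action.target)


def choose_action(command: str, shot: dict) -> Action:
    """Hypothetical policy; a real system would query a model here."""
    if "search" not in shot["fields"]:
        return Action("type", "search", command)
    if "submit" not in shot["clicks"]:
        return Action("click", "submit")
    return Action("done")


def run_agent(command: str, browser: FakeBrowser, max_steps: int = 10) -> list:
    """Observe, decide, act, and repeat until done or out of steps."""
    history = []
    for _ in range(max_steps):
        shot = browser.screenshot()             # observe the page
        action = choose_action(command, shot)   # decide the next step
        if action.kind == "done":
            break
        browser.apply(action)                   # act: click or type
        history.append(action.kind)
    return history
```

Calling `run_agent("flights to Tokyo", FakeBrowser())` walks through one "type" step and one "click" step before stopping. Because each step needs a fresh observation and a decision, a loop of this shape is naturally slower than a scripted macro, which fits the reported pause before each action.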
👨🏻💻Why this matters - If successful, Project Jarvis could make AI tools more accessible, allowing even non-technical users to complete tasks with ease. By handling processes directly in the browser, it would remove the need for complex coding or APIs, letting users type simple instructions and rely on AI to do the rest.
🙆🏻♀️What else is happening?
Meta signs its first big AI deal for news // The agreement will bring news-related answers to Meta’s AI chatbot, with citations linking to Reuters content
OpenAI’s Whisper transcription tool has hallucination issues, researchers say // Researchers revealed that Whisper has introduced everything from racial commentary to imagined medical treatments into transcripts
TSMC tech in Huawei’s AI chips raises questions about ‘porous’ supply chain // It remains unknown how TSMC dies found their way into Huawei’s Ascend 910B, but analysts say it shows the limits of Washington’s sanctions
👩🏼🚒Discover mind-blowing AI tools
OpenTools AI Tools Expert - Find the perfect AI tool to supercharge your workflow. This GPT is connected to our database, so you can ask in-depth questions about any AI tool directly in ChatGPT (free)
DomainWoohoo - A website that helps users search for available domain names
Neurons - An AI-powered tool that helps businesses optimize their creative assets, to increase conversions and improve performance
Quickads - An AI-powered platform that allows users to design ads for various platforms and campaigns, including display ads
Muzify - An AI-powered platform that creates personalized music playlists to accompany books
UXSniff - An AI-powered tool that provides user experience insights for websites
ProductScope - An AI-powered platform that helps brands and marketers optimize their Amazon listings
Fliz - An AI-powered tool that automates the creation of high-quality product videos
Logo Theme AI - Allows users to customize and adapt their logos for various themes, events, and occasions
Interested in featuring your services with us? Email us at [email protected]