👀Grok Gains Vision

PLUS: Google’s New AI Agent

Reading time: 5 minutes

Key Points 

  • xAI’s Grok AI now allows paid users on X to upload images and ask questions about them.

  • Elon Musk highlighted that Grok can explain jokes using the new image-understanding feature, which is still developing.

  • Musk hinted that Grok may soon handle documents like PDFs and promised that rapid improvements are underway.

🤓News - Elon Musk-owned xAI has added image-understanding capabilities to its Grok AI model, allowing paid users of the social platform X who have access to the chatbot to upload images and ask questions about them. The update was announced via the official @grok handle on Monday.

In a separate post, Musk highlighted that Grok can even interpret the meaning behind jokes using this new feature, although he noted that the functionality is still in its early stages and is expected to improve rapidly. 

Additionally, Musk hinted at plans for Grok to understand documents, responding to user criticism regarding the model's inability to handle certain file formats like PDFs. “Not for long,” he asserted, claiming that xAI aims to achieve in months what others take years to accomplish.

💁🏻‍♂️See also - X is actively enhancing its AI chatbot and paid user offerings to attract more users. Earlier this month, the platform introduced a new tool called Radar for Premium+ subscribers, designed to monitor real-time trends and provide insights into ongoing conversations, further expanding its suite of features for users.

Key Points 

  • Project Jarvis is Google’s reported new AI-driven tool, designed to automate browser tasks like research, shopping, and bookings.

  • The AI system, optimized for Chrome, captures screenshots and interprets them to execute user commands, though it reportedly pauses for a few seconds before each step.

♨️News - Google is reportedly developing a new “large action model” designed as a computer-using agent that can handle browser-based tasks for users, including gathering research, making purchases, and booking flights. 

Known internally as “Project Jarvis,” the tool is expected to run on an upcoming version of Google’s Gemini AI and could be previewed as soon as December. Initially, Jarvis would be available only to early testers, suggesting that a public launch is still some time away.

Notably, Jarvis is designed to work exclusively within a web browser and is specifically optimized for Chrome. This tool could automate various everyday, web-based activities by interpreting commands, taking screenshots of the user’s screen, and then executing actions such as clicking buttons or typing into fields. However, reports indicate Jarvis currently operates at a slow pace, as it “thinks” for a few seconds before each step.
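For readers curious what such a screenshot-interpret-act loop might look like, here is a minimal sketch in Python. It is not Google's implementation, and nothing about Jarvis's internals is public: every name below (`Action`, `run_agent`, `FakeBrowser`, `scripted_policy`) is hypothetical, the "browser" is a toy in-memory object, and a hard-coded policy stands in for the AI model.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """One step the agent wants to take (hypothetical structure)."""
    kind: str         # "click", "type", or "done"
    target: str = ""  # label of the on-screen element to act on
    text: str = ""    # text to enter when kind == "type"


def run_agent(command, take_screenshot, choose_action, perform, max_steps=10):
    """Observe the screen, pick an action, perform it, and repeat."""
    history = []
    for _ in range(max_steps):
        screen = take_screenshot()                       # observe
        action = choose_action(command, screen, history) # interpret
        if action.kind == "done":
            break
        perform(action)                                  # act
        history.append(action)
    return history


class FakeBrowser:
    """Toy stand-in for a real browser, used only for this illustration."""

    def __init__(self):
        self.fields = {}
        self.clicked = []

    def screenshot(self):
        # A real agent would capture pixels; here we return a text summary.
        return f"fields={self.fields} clicked={self.clicked}"

    def perform(self, action):
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click":
            self.clicked.append(action.target)


def scripted_policy(command, screen, history):
    """Stand-in for the model: type the command, click search, then stop."""
    if len(history) == 0:
        return Action("type", "search_box", command)
    if len(history) == 1:
        return Action("click", "search_button")
    return Action("done")


browser = FakeBrowser()
steps = run_agent("flights to Lisbon", browser.screenshot,
                  scripted_policy, browser.perform)
print([step.kind for step in steps])
```

In a real system, the scripted policy would be replaced by a model call on each pass through the loop, which is consistent with the reported pause before every step.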

👨🏻‍💻Why this matters - If successful, Project Jarvis could make AI tools more accessible, allowing even non-technical users to complete tasks with ease. By handling processes directly in the browser, it would remove the need for complex coding or APIs, letting users type simple instructions and rely on AI to do the rest. 

🙆🏻‍♀️What else is happening?

👩🏼‍🚒Discover mind-blowing AI tools

  1. OpenTools AI Tools Expert - Find the perfect AI tool to supercharge your workflow. This GPT is connected to our database, so you can ask in-depth questions about any AI tool directly in ChatGPT (free)

  2. DomainWoohoo - A website that helps users search for available domain names

  3. Neurons - An AI-powered tool that helps businesses optimize their creative assets, to increase conversions and improve performance

  4. Quickads - An AI-powered platform that allows users to design ads for various platforms and campaigns, including display ads

  5. Muzify - An AI-powered platform that creates personalized music playlists to accompany books

  6. UXSniff - An AI-powered tool that provides user experience insights for websites

  7. ProductScope - An AI-powered platform that helps brands and marketers optimize their Amazon listings

  8. Fliz - An AI-powered tool that automates the creation of high-quality product videos

  9. Logo Theme AI - Allows users to customize and adapt their logos for various themes, events, and occasions

Interested in featuring your services with us? Email us at [email protected]