🫨AI To Face Its Toughest Test!

PLUS: AI Safety in Focus

Reading time: 5 minutes

Key Points 

  • "Humanity’s Last Exam" aims to challenge AI with tough questions to determine when it reaches expert level.

  • The initiative follows OpenAI's release of its o1 model, which outperformed the most popular reasoning benchmarks.

  • The new exam will focus on abstract reasoning and include at least 1,000 crowd-sourced questions.

👨🏻‍💻News - A group of technology experts has launched a global initiative to find the most challenging questions to test artificial intelligence systems.

The project, known as "Humanity's Last Exam," aims to identify when AI reaches an expert level. It’s being organized by the Center for AI Safety (CAIS) and the startup Scale AI, with the goal of staying relevant as AI continues to advance in the coming years.

🧐What prompted the move? This call comes just days after OpenAI revealed its newest model, OpenAI o1, which, as Dan Hendrycks, the executive director of CAIS, put it, "completely destroyed the most popular reasoning benchmarks."

For those who don’t know, Dan Hendrycks co-authored two key papers in 2021 on testing AI, covering college-level topics and advanced math. Back then, AI systems were giving almost random answers. Now they’ve improved dramatically: Anthropic’s Claude models, for example, jumped from around 77% on the undergraduate-level test in 2023 to nearly 89% a year later. Because of this, those benchmarks aren’t as useful anymore.

🤔How will the new test be different? Some AI researchers think that planning and abstract reasoning are better indicators of intelligence. To reflect this, "Humanity’s Last Exam" will focus on abstract reasoning.

Hendrycks mentioned that some questions will be kept secret to prevent AI from simply memorizing the answers. He did say, though, that the exam will feature at least 1,000 crowd-sourced questions, due by November 1, that will be tough for non-experts to answer.

Key Points 

  • The group emphasized that governments must monitor AI research labs and find ways to discuss risks without forcing companies to reveal confidential information.

  • The group suggested setting up AI safety agencies to track systems and establish guidelines, with an international body overseeing coordination.

☕News - On Monday, a group of leading AI scientists from the US, China, Britain, Singapore, Canada, and other countries issued a statement warning that AI is becoming dangerously powerful. They noted that AI might soon surpass human abilities, and that if we lose control of it or it’s misused, the consequences could be severe for everyone.

🕵🏻‍♂️What stood out? Gillian Hadfield, a legal scholar and professor at Johns Hopkins University, warned that if AI systems developed advanced capabilities today, there’s no plan in place to control them. She questioned who we would turn to if a catastrophe happened in six months and we found models improving themselves autonomously.

The group stated that governments need to keep tabs on what’s happening at AI research labs and companies in their countries. They added that governments also need to figure out a way to discuss potential risks without forcing companies or researchers to share proprietary information with competitors.

🤓Do they have a solution in mind? The group suggested that countries establish AI safety agencies to track AI systems within their borders. These agencies would collaborate to set guidelines and warning signs, like detecting if an AI system can replicate itself or deceive its creators. An international body would then oversee and coordinate these efforts.

🙆🏻‍♀️What else is happening?

👩🏼‍🚒Discover mind-blowing AI tools

  1. OpenTools AI Tools Expert - Find the perfect AI tool to supercharge your workflow. This GPT is connected to our database, so you can ask in-depth questions about any AI tool directly in ChatGPT (free)

  2. SheetGPT - A Google Sheets add-on that allows users to integrate OpenAI's GPT-3.5 text and image generation capabilities directly within Google Sheets

  3. MidJourney Prompt Generator - A tool designed to help users quickly generate unique art styles using AI technology

  4. Dimeadozen.ai - A platform that allows users to analyze and validate their business ideas instantly by generating detailed business reports

  5. The GPT-Who-Lived - An interactive experience utilizing GPT-3 to create unique and immersive stories that engage with popular fictional universes

  6. JIT - An AI-powered platform that simplifies and speeds up coding by providing tools for code generation, optimization, and collaboration

  7. Clearmind - A personalized AI therapy platform designed to measure and elevate your emotional health

  8. Durable - A tool that simplifies the website creation process using artificial intelligence, allowing users to build websites quickly and efficiently without deep coding knowledge

  9. Agentive - An audit automation platform that simplifies and automates your audits

How likely is it that you would recommend the OpenTools newsletter to a friend or colleague?


Interested in featuring your services with us? Email us at [email protected]