In tomorrow's paper reading we'll be covering what's next for AI benchmarking. Srilakshmi C. & John G. will kick off the discussion by diving into "Humanity's Last Exam" (HLE), which was designed to be the final closed-ended academic benchmark of its kind, with broad subject coverage. The dataset consists of 2,700 challenging questions spanning more than a hundred subjects. They'll also cover some recent AI news. Sign up here if you're not already on the list: https://arize.com/resource/community-papers-reading/
Thanks to all who joined the Paper Reading on Benchmarks this morning! The recording will follow soon, but in the meantime I wanted to follow up on a question that was asked about benchmarks for AI safety. The o1 system card has some examples of safety benchmarking: a whole suite of benchmarks run on various datasets. I recommend checking it out if you're curious about how model safety is measured!
I actually just got the recording up here: https://youtu.be/m03ZMxbWmq0?si=393KsTrl1bh41IgY
