Humanity's Last Exam (HLE) Benchmark

From GM-RKB

A Humanity's Last Exam (HLE) Benchmark is a PhD-level, science-focused AI evaluation benchmark, developed by the Center for AI Safety and Scale AI, that evaluates AI model performance on HLE PhD-level science questions (in domains such as HLE chemistry and HLE biology).