BrowseComp Benchmark

From GM-RKB

A BrowseComp Benchmark is a web agent benchmark that evaluates an agent's autonomous web browsing and navigation capabilities by having it locate hard-to-find information online and return short, verifiable answers.