APEX

APEX: AI Productivity Index

AI-focused; Public; Neither

Specific Type: AI benchmarking
Dataset Type: Cross-sectional
Institution: Mercor
Institution Type: Private Data Provider
Level of Focus: Task capability; Occupation
Most Granular Level: Individual professional task level
Perspective: Neither
Time Coverage: 2025-present
Frequency: Static benchmark with periodic updates
Sample Size: 400 test cases (v1-extended) across 4 professional domains, created by 76 domain experts
Geographic Detail: Global
Occupational Classification: Professional domains (investment banking, management consulting, law, primary medical care)

Key Variables

AI task completion quality; expert-level comparison; productivity measurement across professional domains

AI/Tech Tracking

Evaluates frontier AI models on extended professional tasks that require sustained reasoning and domain expertise; graded by domain experts

Access Details

Leaderboard and results publicly available

Notes

Focuses on tasks that take an expert significant time (1-8 hours), which distinguishes it from benchmarks that test quick-answer capabilities; reflects the trend toward economically grounded AI evaluation

Key Papers

Vidgen et al. (2025)