
MedQA

MedQA: Medical Question Answering Dataset (USMLE-style)

AI-focused · Public · Neither
Specific Type
AI benchmarking
Dataset Type
Cross-sectional
Institution
Multiple institutions
Institution Type
Academia
Level of Focus
Task capability
Most Granular Level
Individual medical question level
Perspective
Neither
Time Coverage
2020-present
Frequency
Static benchmark with extensions
Sample Size
12,723 English questions; 34,251 Chinese questions
Geographic Detail
Global
Occupational Classification
Not specified
Industrial Classification
Not specified
Other Classification
Medical domain classification
Key Variables
Medical knowledge accuracy; clinical reasoning; diagnostic capability
AI/Tech Tracking
USMLE-style medical knowledge and reasoning
Access Details
Available through Papers with Code and Hugging Face
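The English split is commonly distributed as JSON Lines, with each record carrying a question stem, lettered answer options, and a gold answer key. A minimal sketch of scoring a predicted option letter against one such record; the field names (`question`, `options`, `answer`, `answer_idx`) follow the widely circulated release format, and the vignette itself is invented for illustration:

```python
import json

# Hypothetical MedQA-style record (clinical vignette invented for illustration;
# field names assume the commonly released JSON Lines layout).
record_json = '''
{"question": "A 23-year-old woman presents with fatigue and pallor...",
 "options": {"A": "Iron deficiency anemia",
             "B": "Hypothyroidism",
             "C": "Major depressive disorder",
             "D": "Vitamin B12 deficiency"},
 "answer": "Iron deficiency anemia",
 "answer_idx": "A"}
'''

def check_prediction(record: dict, predicted_letter: str) -> bool:
    """Return True if the predicted option letter matches the gold answer key."""
    return predicted_letter == record["answer_idx"]

record = json.loads(record_json)

# Sanity check: the answer text should match the option the key points to.
assert record["options"][record["answer_idx"]] == record["answer"]

print(check_prediction(record, "A"))  # a correct prediction
print(check_prediction(record, "B"))  # an incorrect prediction
```

Benchmark accuracy is then simply the fraction of records for which `check_prediction` returns True.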
Notes
Based on the United States Medical Licensing Examination (USMLE) and analogous board exams; covers clinical knowledge and clinical reasoning

Key Papers

Jin et al. (2021), "What Disease Does This Patient Have? A Large-Scale Open Domain Question Answering Dataset from Medical Exams"; various subsequent medical-AI evaluation papers