
MedQA

MedQA: Medical Question Answering Dataset (USMLE-style)

AI-focused · Public · Neither
Specific Type
AI benchmarking
Dataset Type
Cross-sectional
Institution
Multiple institutions
Institution Type
Academia
Level of Focus
Task capability
Most Granular Level
Individual medical question level
Perspective
Neither
Time Coverage
2020-present
Frequency
Static benchmark with extensions
Sample Size
12,723 English questions; 34,251 Chinese questions
Geographic Detail
Global
Occupational Classification
Not specified
Industrial Classification
Not specified
Other Classification
Medical domain classification
Key Variables
Medical knowledge accuracy; clinical reasoning; diagnostic capability
AI/Tech Tracking
USMLE-style medical knowledge and reasoning
Access Details
Available through Papers with Code and Hugging Face
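The English split is commonly distributed as JSON Lines, with each record carrying a question stem, lettered answer options, and a gold answer key. A minimal sketch of scoring a predicted option letter against one such record; the field names (`question`, `options`, `answer`, `answer_idx`) follow the widely circulated release format, and the vignette itself is invented for illustration:

```python
import json

# Hypothetical MedQA-style record (clinical vignette invented for illustration;
# field names assume the commonly released JSON Lines layout).
record_json = '''
{"question": "A 23-year-old woman presents with fatigue and pallor...",
 "options": {"A": "Iron deficiency anemia",
             "B": "Hypothyroidism",
             "C": "Major depressive disorder",
             "D": "Vitamin B12 deficiency"},
 "answer": "Iron deficiency anemia",
 "answer_idx": "A"}
'''

def check_prediction(record: dict, predicted_letter: str) -> bool:
    """Return True if the predicted option letter matches the gold answer key."""
    return predicted_letter == record["answer_idx"]

record = json.loads(record_json)

# Sanity check: the answer text should match the option the key points to.
assert record["options"][record["answer_idx"]] == record["answer"]

print(check_prediction(record, "A"))  # a correct prediction
print(check_prediction(record, "B"))  # an incorrect prediction
```

Benchmark accuracy is then simply the fraction of records for which `check_prediction` returns True.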
Notes
Based on the United States Medical Licensing Examination (USMLE) and analogous board exams; covers clinical knowledge and clinical reasoning

Key Papers

Jin et al. (2021), "What Disease Does This Patient Have? A Large-Scale Open Domain Question Answering Dataset from Medical Exams"; various subsequent medical-AI evaluation papers