- Key Variables
- Conversation content, pairwise preferences, model comparisons, Elo ratings, user votes, language detection
- AI/Tech Tracking
- Multi-model LLM usage patterns and preferences
- Access Details
- Available on Hugging Face, requires agreement to terms
- Notes
- Focus on model evaluation through human preferences; contains both safe and unsafe conversations; live platform allows ongoing data collection
LMArena/Chatbot Arena
LMSYS-Chat-1M: Large-Scale Real-World LLM Conversation Dataset
AI-focused, Public, Worker-side
- Specific Type
- AI usage "In the wild"
- Dataset Type
- Cross-sectional
- Institution
- LMSYS (UC Berkeley, UC San Diego, CMU)
- Institution Type
- Academia
- Level of Focus
- Individual conversations
- Most Granular Level
- Conversation level with preference data
- Perspective
- Worker-side
- Time Coverage
- 2023-present
- Frequency
- Periodic releases
- Sample Size
- 1M conversations (210K unique IPs), 33K preference pairs
- Geographic Detail
- Global (IP-based geographic inference)
- Occupational Classification
- Not specified
- Industrial Classification
- Not specified
- Other Classification
- Language tags, Geographic (IP-based)
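
The key variables for this entry include pairwise preferences and Elo ratings. As a rough illustration of how the two relate, the sketch below applies the standard online Elo update to a list of pairwise votes. It is a minimal example under stated assumptions, not necessarily the procedure used to produce the ratings in this dataset: the `(model_a, model_b, winner)` tuple layout, the winner labels `"model_a"`, `"model_b"`, and `"tie"`, the model names, and the constants (`initial=1000`, `k=4`) are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def update_elo(ratings, model_a, model_b, winner, k=4.0, scale=400.0):
    """Apply one standard Elo update for a single pairwise vote."""
    ra, rb = ratings[model_a], ratings[model_b]
    # Expected score of model_a under the logistic Elo model.
    expected_a = 1.0 / (1.0 + 10.0 ** ((rb - ra) / scale))
    # Observed score of model_a: win = 1, loss = 0, tie = 0.5.
    # The label strings are an assumption, not a documented schema.
    score_a = {"model_a": 1.0, "model_b": 0.0, "tie": 0.5}[winner]
    delta = k * (score_a - expected_a)
    # Zero-sum update: whatever model_a gains, model_b loses.
    ratings[model_a] = ra + delta
    ratings[model_b] = rb - delta


def compute_elo(battles, initial=1000.0, k=4.0):
    """Run the online update over battles in order; returns {model: rating}."""
    ratings = defaultdict(lambda: initial)
    for model_a, model_b, winner in battles:
        update_elo(ratings, model_a, model_b, winner, k=k)
    return dict(ratings)


# Hypothetical votes: (model shown as A, model shown as B, user's choice).
battles = [
    ("model_x", "model_y", "model_a"),  # user preferred model_x
    ("model_y", "model_z", "tie"),
    ("model_x", "model_z", "model_a"),  # user preferred model_x
]
ratings = compute_elo(battles)
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.2f}")
```

Online Elo depends on the order in which votes are processed, so results from a sketch like this are sensitive to how the battles are sorted or shuffled.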
Key Papers
Tamkin et al. (2024); Chiang et al. (2024) - "Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference"; Zheng et al. (2024) - "LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset"