
WildChat

WildChat: 1M ChatGPT Interaction Logs in the Wild

AI-focused, Public, Worker-side
Specific Type
AI usage "in the wild"
Dataset Type
Cross-sectional
Institution
Allen Institute for AI
Institution Type
Academia
Level of Focus
Individual conversations
Most Granular Level
Conversation level with demographic data
Perspective
Worker-side
Time Coverage
2023
Frequency
One-time static snapshot
Sample Size
1M conversations, 2.5M interaction turns, 204K unique IPs
Geographic Detail
State and country level
Occupational Classification
Not specified
Industrial Classification
Not specified
Other Classification
Geographic (state, country), Language tags
Key Variables
Conversation content, user demographics, language detection, toxicity flags, timestamps
AI/Tech Tracking
ChatGPT usage patterns (GPT-3.5-Turbo and GPT-4)
Access Details
Available on Hugging Face under the ODC-BY license
Notes
Multilingual; includes some toxic content; intended primarily for research and model training
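
Example
A minimal sketch of how the key variables above might be used together: it drops conversations flagged as toxic and counts the rest by country and language. The records are invented for illustration, and the field names (country, state, language, toxic, turn) are assumptions about the released schema, so check them against the dataset card on Hugging Face before relying on them.

```python
from collections import Counter

# Hypothetical records shaped like the key variables listed above.
# These are NOT real WildChat rows, and the field names are assumed.
# With the Hugging Face `datasets` library installed, real rows could be
# obtained with something like:
#   load_dataset("allenai/WildChat-1M", split="train")
# (the repository id is an assumption; confirm it on the dataset page).
SAMPLE = [
    {"country": "United States", "state": "Texas", "language": "English",
     "toxic": False, "turn": 2},
    {"country": "United States", "state": "Ohio", "language": "English",
     "toxic": True, "turn": 1},
    {"country": "France", "state": None, "language": "French",
     "toxic": False, "turn": 3},
]


def summarize(records, drop_toxic=True):
    """Count conversations and turns per (country, language) pair."""
    conversations = Counter()
    turns = Counter()
    for rec in records:
        # Skip flagged conversations unless the caller asks to keep them.
        if drop_toxic and rec["toxic"]:
            continue
        key = (rec["country"], rec["language"])
        conversations[key] += 1
        turns[key] += rec["turn"]
    return conversations, turns


conversations, turns = summarize(SAMPLE)
for key, n in conversations.most_common():
    print(key, n, turns[key])
```

Passing drop_toxic=False keeps flagged conversations in the counts, which may matter for work that studies the toxic content itself.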