This site is a work in progress and has not been widely shared. Content may contain errors. Feedback is welcome.
This site is undergoing review. Some annotations were human-generated, some AI-generated — all are being verified.

Clio: Privacy-Preserving Insights into Real-World AI Use

Tamkin, McCain, Handa, Durmus, Lovitt, Rathi, Huang, Mountfield, Hong, Ritchie, Stern, Clarke, Goldberg, Sumers, Mueller, McEachen, Mitchell, Carter, Clark, Kaplan, Ganguli

2024; arXiv; 4 citations
Data infrastructure; Computer Science / AI
LLM / Generative AI; Writing / content; Software / coding; Platforms / gig economy
Abstract

How are AI assistants being used in the real world? While model providers in theory have a window into this impact via their users' data, both privacy concerns and practical challenges have made analyzing this data difficult. To address these issues, we present Clio (Claude insights and observations), a privacy-preserving platform that uses AI assistants themselves to analyze and surface aggregated usage patterns across millions of conversations, without the need for human reviewers to read raw conversations. We validate this can be done with a high degree of accuracy and privacy by conducting extensive evaluations. We demonstrate Clio's usefulness in two broad ways. First, we share insights about how models are being used in the real world from one million Claude.ai Free and Pro conversations, ranging from providing advice on hairstyles to providing guidance on Git operations and concepts. We also identify the most common high-level use cases on Claude.ai (coding, writing, and research tasks) as well as patterns that differ across languages (e.g., conversations in Japanese discuss elder care and aging populations at higher-than-typical rates). Second, we use Clio to make our systems safer by identifying coordinated attempts to abuse our systems, monitoring for unknown unknowns during critical periods like launches of new capabilities or major world events, and improving our existing monitoring systems. We also discuss the limitations of our approach, as well as risks and ethical concerns. By enabling analysis of real-world AI usage, Clio provides a scalable platform for empirically grounded AI safety and governance.

Summary

Tamkin, McCain, and colleagues at Anthropic develop Clio, a privacy-preserving platform that uses AI assistants to analyze millions of conversations through LLM summarization and embedding-based clustering. They validate the system on synthetic data and apply it to 1M+ Claude.ai conversations to surface usage patterns and safety violations.

Main Finding

Clio reconstructs topic distributions with 94% accuracy on synthetic data. Applied to Claude.ai, it shows that coding and business tasks dominate usage (web/mobile development alone accounts for more than 10% of conversations) and that usage varies significantly across languages (e.g., Japanese/Chinese conversations discuss elder care at higher rates). It also identifies coordinated abuse patterns that are invisible at the level of individual conversations.

Primary Datasets

Claude.ai conversation logs (proprietary); WildChat dataset (public); LMSYS-1M-Chat dataset (public)

Secondary Datasets

Synthetic multilingual conversation dataset (generated for validation)

Key Methods
Privacy-preserving clustering pipeline using LLM-based summarization, embedding (all-mpnet-base-v2), k-means clustering, and hierarchical organization; manual validation of pipeline components; synthetic data reconstruction tests
Sample Period
2024
Geographic Coverage
Global (Claude.ai users)
Sample Size
1 million Claude.ai conversations for main usage analysis; 2.3 million for multilingual analysis; 500,000 for safety classifier analysis; 19,476 synthetic conversations for validation
Level of Analysis
Individual
Occupation Classification
None
Industry Classification
None
Notes
arXiv:2412.13678 [Claude classification]: This paper presents a data infrastructure tool (Clio) rather than answering a traditional empirical economics question. It uses LLMs as methodological tools to analyze LLM usage data. The paper validates its pipeline and provides descriptive insights about Claude.ai usage patterns. The 4 citations reflect its very recent publication (Dec 2024). Teams/Enterprise/API customers excluded from analysis.
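
Illustrative sketch: the clustering step named under Key Methods (embed conversation summaries, then group them with k-means) can be illustrated with a small self-contained example. This is a minimal sketch under stated assumptions, not the paper's implementation. The summaries and their 2-D vectors are hypothetical stand-ins for LLM-written summaries and all-mpnet-base-v2 embeddings, the `kmeans` helper is a plain Lloyd's-algorithm toy written for this example, and seeding the centroids from the first k points is a simplification. The summarization and hierarchical-organization stages are not shown.

```python
import math
from collections import Counter


def kmeans(points, k, iters=20):
    """Toy Lloyd's k-means on lists of floats; returns (centroids, labels)."""
    # Assumption: seed centroids from the first k points to stay deterministic.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


# Hypothetical summaries with made-up 2-D "embeddings" (real ones are
# high-dimensional vectors produced by an embedding model).
summaries = [
    ("debug a Git rebase conflict", [0.90, 0.10]),
    ("explain Git branching", [0.80, 0.20]),
    ("fix a failing unit test", [0.85, 0.15]),
    ("advice on hairstyles", [0.10, 0.90]),
    ("choose a haircut for a wedding", [0.20, 0.80]),
    ("hair colour suggestions", [0.15, 0.95]),
]

centroids, labels = kmeans([vec for _, vec in summaries], k=2)
cluster_sizes = Counter(labels)

# Report only aggregate information per cluster, not raw conversations.
for cluster_id, size in sorted(cluster_sizes.items()):
    examples = [text for (text, _), lab in zip(summaries, labels) if lab == cluster_id]
    print(f"cluster {cluster_id}: {size} summaries, e.g. {examples[0]!r}")
```

In a fuller pipeline the vectors would come from an embedding model and k would be chosen relative to the number of conversations; both choices are outside what this record specifies.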