- Key Methods
- Privacy-preserving LLM-based classification of millions of Claude.ai conversations mapped to O*NET occupational tasks through hierarchical tree-based search; descriptive analysis of usage patterns across occupations, skills, wages, and automation vs. augmentation modes
- Sample Period
- 2021-2025
- Geographic Coverage
- United States
- Sample Size
- ~4 million Claude.ai conversations (1M for main task analysis Dec 16-23, 2024; 500K for skills analysis Jan 10-17, 2025; 1M for automation/augmentation Dec 16-23, 2024; 1M for model comparison Dec 15, 2024-Jan 4, 2025; 2.8M for cluster validation Nov 28-Dec 18, 2024)
- Level of Analysis
- Task, Occupation, Individual
- Occupation Classification
- 2010 SOC (Standard Occupational Classification), mapped to 2018 SOC for exposure measures
- Industry Classification
- NAICS (for robustness checks excluding information sector)
- Replication Package
- Partial
- Notes
arXiv:2503.04761
[Claude classification]: Uses both the Eloundou et al. (2024) GPT-4 beta exposure measures and the Handa et al. (2025) Claude-based measures, including the automation vs. augmentation distinction. Finds employment declines concentrated in automation-oriented AI applications but not augmentation-oriented ones. Results are robust to excluding computer occupations, teleworkable occupations, and information-sector firms. Compensation effects are minimal, suggesting wage stickiness. Sample includes 3.5-5 million workers monthly from ADP payroll data.
[Claude classification]: This paper uses Clio (Tamkin et al., 2024), a privacy-preserving framework that uses Claude to analyze aggregated conversation patterns. The study is purely descriptive and makes no causal claims. Classification proceeds via hierarchical tree-based search through O*NET tasks, with the tree built by k-means clustering over sentence embeddings. Human validation shows 86% accuracy at the base O*NET level, 91.3% at the middle level, and 95.3% at the top level. Key limitations: single platform (Claude.ai), U.S.-centric O*NET framework, inability to observe how outputs are actually used in workflows, and potential overestimation from novice users. Sample: 1M conversations for the main analysis (Dec 2024), plus 500K for the skills analysis (Jan 2025). The paper builds on the task-based framework of Autor et al. (2003) and complements the exposure predictions of Webb (2019) and Eloundou et al. (2023) with actual usage data.
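The hierarchical tree-based search described in the note above can be sketched in miniature: embed the task statements, cluster them with k-means to form a top level, then classify a query by descending from the nearest centroid to the nearest task within that cluster. This is a minimal illustration, not the paper's implementation; the toy task strings and the bag-of-words "embedding" are stand-ins for the real O*NET statements and sentence embeddings.

```python
# Sketch of hierarchical tree-based task classification. Assumptions:
# toy task strings stand in for O*NET task statements, and a bag-of-words
# count vector stands in for the sentence embeddings used in the paper.
import numpy as np

tasks = [
    "write software code for applications",
    "debug and test computer programs",
    "prepare tax returns for clients",
    "audit financial statements and records",
    "teach mathematics to students",
    "develop lesson plans for courses",
]

vocab = sorted({w for t in tasks for w in t.split()})

def embed(text):
    """Toy embedding: count vector over the task vocabulary."""
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

X = np.stack([embed(t) for t in tasks])

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's algorithm: assign to nearest center, recompute means."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(
            np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1
        )
        for j in range(k):
            if np.any(labels == j):  # keep old center if cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

centers, labels = kmeans(X, k=3)

def classify(query):
    """Two-level search: nearest cluster centroid, then nearest task in it."""
    q = embed(query)
    c = int(np.argmin(np.linalg.norm(centers - q, axis=1)))
    members = np.flatnonzero(labels == c)
    best = members[np.argmin(np.linalg.norm(X[members] - q, axis=1))]
    return tasks[best]

print(classify("debug a computer program"))
```

The point of the tree is efficiency: the paper's version searches a deeper hierarchy over thousands of O*NET task statements, so each query compares against a handful of centroids per level rather than every task.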