Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality
Dell’Acqua, McFowland, Mollick, Lifshitz‐Assaf, Kellogg, Rajendran, Krayer, Candelon, Lakhani
2023; Harvard Business School Working Paper Series; 599 citations
Experimental evidence; Interdisciplinary; Causal
LLM / Generative AI; Writing / content; Creative work; Human-AI collaboration; Augmentation vs. substitution; Training / upskilling
Summary
Dell'Acqua and colleagues conduct a pre-registered field experiment in which 758 Boston Consulting Group consultants are randomly assigned to work with or without GPT-4. Performance is tested on realistic consulting tasks both inside and outside the AI capability frontier to understand how generative AI affects knowledge worker productivity and quality.
Main Finding
For tasks within the AI capability frontier, consultants using GPT-4 completed 12.2% more tasks, finished them 25.1% faster, and produced work rated more than 40% higher in quality. Bottom-half performers improved by 43%, versus 17% for top-half performers. For the task outside the frontier, however, AI users were 19 percentage points less likely to reach the correct solution, even though their recommendations were rated higher in quality.
Primary Datasets
Boston Consulting Group experimental data (758 consultants, approximately 7% of global individual contributor consultants); proprietary task completion data, AI interaction logs, human and GPT-4 quality evaluations
Secondary Datasets
Psychological assessments (Big 5 personality, innovativeness, creativity, paradox mindset); demographic and tenure data; GPT-4 interaction logs (all prompts and responses)
Key Methods
Pre-registered randomized field experiment with three conditions (no AI, GPT-4 access, GPT-4 plus training) testing 18 tasks inside the AI frontier and 1 task outside; human and GPT-4 evaluation of outputs; analysis of prompting behaviors and AI interaction patterns
Sample Period
2023
Geographic Coverage
International
Sample Size
758 consultants completing experimental tasks; 385 in inside-frontier experiment (creative product innovation), 373 in outside-frontier experiment (business problem-solving)
Level of Analysis
Individual, Task
Occupation Classification
None
Industry Classification
None
Notes
Harvard Business School Working Paper No. 24-013
[Claude classification]: This is a landmark field experiment on LLM effects on high-skill knowledge work. The paper introduces the concept of a 'jagged technological frontier', in which AI capabilities are uneven across tasks of similar apparent difficulty. It identifies two distinctive patterns of human-AI integration: 'Centaur' behavior (strategic division of labor between human and AI) and 'Cyborg' behavior (complete integration of workflows). The experiment used actual BCG consultants (n=758, about 7% of individual contributors globally) performing realistic job tasks. The paper also documents reduced idea diversity with AI use (measured via semantic similarity of outputs). The study used GPT-4 both as the experimental treatment and as an evaluator of outputs. Performance was tied to office recognition and career implications, giving participants an incentive to engage seriously.