No Great Equalizer: Experimental Evidence on Productivity Effects of Generative AI Use in the UK Labor Market

Haslberger, Gingrich, Bhatia

2023 | SSRN Electronic Journal | 18 citations

Experimental evidence | Interdisciplinary | Causal

LLM / Generative AI | Writing / content | Junior / entry-level | Senior / older workers | Gender | Human-AI collaboration | Augmentation vs. substitution

DOI: 10.2139/ssrn.4594466

Summary

Haslberger, Gingrich, and Bhatia conduct a pre-registered online randomized experiment with 1,041 UK workers to study how ChatGPT exposure affects productivity and inequality across different tasks, demographic groups, and skill levels.

Main Finding

Exposure to ChatGPT significantly improved task performance (0.41-0.92 SD) and reduced completion time (0.32-0.69 SD) across all worker groups, but benefits were greatest for younger workers, with no compression of inequality between educational or occupational groups and increased age-based inequality.

Primary Datasets

Original online experiment conducted via YouGov (July 2023, N=1,041)

Secondary Datasets

None

Key Methods: Pre-registered online randomized experiment with treatment group encouraged to use ChatGPT and control group discouraged from AI use; three text-based tasks of varying complexity; AI-assisted grading using GPT-4; intention-to-treat analysis with compliance checks
Sample Period: 2023
Geographic Coverage: UK
Sample Size: 1,041 UK working-age adults (504 treatment, 537 control)
Level of Analysis: Individual
Occupation Classification: None
Industry Classification: None
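
The intention-to-treat analysis named under Key Methods can be illustrated with a short sketch. It is not the authors' code: the function name `itt_effect_sd`, the choice to standardize by the control-group standard deviation, and the toy scores are all assumptions made for illustration. The point it shows is that participants are compared by the arm they were assigned to, regardless of whether they actually used or avoided ChatGPT.

```python
from statistics import mean, stdev


def itt_effect_sd(treated_scores, control_scores):
    """Intention-to-treat effect in control-group SD units (illustrative only).

    Scores are grouped by *assigned* arm, not by actual ChatGPT use, so
    non-compliers stay in the arm they were randomized to.
    Standardizing by the control-group SD is an assumption of this sketch.
    """
    if len(treated_scores) < 1 or len(control_scores) < 2:
        raise ValueError("need at least 1 treated and 2 control observations")
    control_sd = stdev(control_scores)  # sample standard deviation
    if control_sd == 0:
        raise ValueError("control scores have zero variance")
    return (mean(treated_scores) - mean(control_scores)) / control_sd


# Hypothetical toy scores, NOT data from the paper.
control = [4.0, 5.0, 6.0, 5.0, 5.0]
treated = [5.0, 6.0, 7.0, 6.0, 6.0]

effect = itt_effect_sd(treated, control)
print(f"ITT effect: {effect:.2f} SD")
```

Because compliance is not enforced, an estimate of this kind measures the effect of being encouraged to use ChatGPT rather than the effect of using it.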

Notes

SSRN Electronic Journal [Claude classification]: Online lab experiment (not field experiment as tasks were researcher-designed, not real job duties). Used GPT-4 to grade responses, with high inter-coder reliability with human coders. Sample recruited from YouGov panel members who reported having ChatGPT accounts (5,350 from 34,211 screened). Treatment effects represent intention-to-treat (ITT) as authors could not fully control compliance. Key finding challenges emerging consensus about equalizing effects of AI - found no compression between skill/educational/occupational groups, and increased age inequality.