
Shifting Work Patterns with Generative AI

Dillon, Jaffe, Immorlica, Stanton

2025 | NBER Working Paper Series | 5 citations
Experimental evidence | Causal
LLM / Generative AI | Human-AI collaboration | Training / upskilling
Abstract

We present evidence on how generative AI changes the work patterns of knowledge workers using data from a 6-month-long, cross-industry, randomized field experiment. Half of the 7,137 workers in the study received access to a generative AI tool integrated into the applications they already used for emails, document creation, and meetings. We find that access to the AI tool during the first year of its release primarily impacted behaviors that workers could change independently and not behaviors that require coordination to change: workers who used the tool in more than half of the sample weeks spent 3.6 fewer hours, or 31% less time on email each week (intent to treat estimate is 1.3 hours) and completed documents moderately faster, but did not significantly change time spent in meetings.

Summary

Dillon, Jaffe, Immorlica, and Stanton conduct a 6-month randomized controlled trial across 66 firms with 7,137 knowledge workers to study how access to Microsoft 365 Copilot (a generative AI tool integrated into email, meetings, and writing applications) affects work patterns measured through digital telemetry data.

Main Finding

Among treated workers who used Copilot (80% of those assigned), email time decreased by 2 hours per week (a 17% reduction) and out-of-hours work time fell by 0.36 hours per week. No significant changes were detected in the quantity or composition of tasks in meetings or document writing, which suggests productivity gains on individual tasks without broader task transformation.

Primary Datasets

7,137 knowledge workers (Microsoft); 6-month RCT

Secondary Datasets

None

Key Methods
Randomized controlled trial with 6-month follow-up; difference-in-differences estimation; firm-by-event-month fixed effects; instrumental variables (IV) using treatment assignment to instrument for actual Copilot use; analysis of telemetry data on time allocation and task quantities
Sample Period
2024
Geographic Coverage
US
Sample Size
7,137 workers (3,684 treated, 3,453 control) across 66 firms; up to 337,149 worker-week observations depending on outcome
Level of Analysis
Individual, Firm
Occupation Classification
SOC (Standard Occupational Classification) major groups
Industry Classification
None
Notes
NBER WP 33795; conditionally accepted AER: Insights; 2hr/wk email savings; no detectable task composition shifts [Claude classification]: NBER WP 33795; conditionally accepted at AER: Insights as of paper revision. Treatment was access to Microsoft 365 Copilot integrated into existing work applications. 90% of treated workers used Copilot at least once; 80% were regular users. Main effects concentrated in email management (Outlook); no detectable compositional shifts in tasks. Firm fixed effects explain more variation in adoption than individual pre-period behavior. No negative spillovers to coworkers detected. Used GPT-4o to classify occupations. ML heterogeneity analysis (elastic net, neural networks) found limited treatment effect heterogeneity. Network analysis used to identify close coworkers based on collaboration patterns.
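
Illustration (not from the paper): Key Methods lists instrumental variables with treatment assignment as the instrument for actual Copilot use, and the abstract reports both an intent-to-treat estimate and a larger effect for frequent users. The sketch below shows the general relationship between the two in the simplest case, the Wald estimator, which divides the intent-to-treat difference by the difference in usage rates between arms. It is a minimal sketch under stated assumptions: the function name `wald_estimate` and the toy data are hypothetical, control workers are assumed to have no access, and the fixed effects and difference-in-differences structure the authors describe are omitted.

```python
from statistics import mean


def wald_estimate(outcome, assigned, used):
    """Return (ITT, first stage, IV estimate) for a binary instrument.

    outcome  -- observed outcome per worker (e.g. weekly email hours)
    assigned -- 1 if randomly assigned access, 0 otherwise (the instrument)
    used     -- 1 if the worker actually used the tool, 0 otherwise
    """
    y_assigned = [y for y, z in zip(outcome, assigned) if z == 1]
    y_control = [y for y, z in zip(outcome, assigned) if z == 0]
    d_assigned = [d for d, z in zip(used, assigned) if z == 1]
    d_control = [d for d, z in zip(used, assigned) if z == 0]

    # Intent-to-treat effect: compare arms as randomized, ignoring actual use.
    itt = mean(y_assigned) - mean(y_control)
    # First stage: how much assignment shifts the share of workers using the tool.
    first_stage = mean(d_assigned) - mean(d_control)
    if first_stage == 0:
        raise ValueError("assignment does not shift usage; the instrument is irrelevant")
    # Wald estimator: ITT rescaled by the first stage.
    return itt, first_stage, itt / first_stage


# Hypothetical toy data, NOT the study's data: 10 assigned workers, 4 of whom
# use the tool and spend 3.5 fewer hours on email; 10 control workers without access.
assigned = [1] * 10 + [0] * 10
used = [1] * 4 + [0] * 6 + [0] * 10
email_hours = [8.5] * 4 + [12.0] * 6 + [12.0] * 10

itt, first_stage, late = wald_estimate(email_hours, assigned, used)
print(f"ITT: {itt:.2f} h/week, first stage: {first_stage:.2f}, IV estimate: {late:.2f} h/week")
```

In this toy example the intent-to-treat effect is smaller in magnitude than the effect among users because only some assigned workers take up the tool. Dividing by the take-up rate recovers the per-user effect, provided assignment affects outcomes only through usage.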