
The Value, Benefits, and Concerns of Generative AI-Powered Assistance in Writing

Li, Liang, Peng, Yin

2024, Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 48 citations
Experimental evidence, Computer Science / AI, Causal
LLM / Generative AI, Writing / content, Human-AI collaboration, Augmentation vs. substitution
Abstract

Recent advances in generative AI technologies like large language models raise both excitement and concerns about the future of human-AI co-creation in writing. To unpack people's attitude towards and experience with generative AI-powered writing assistants, in this paper, we conduct an experiment to understand whether and how much value people attach to AI assistance, and how the incorporation of AI assistance in writing workflows changes people's writing perceptions and performance. Our results suggest that people are willing to forgo financial payments to receive writing assistance from AI, especially if AI can provide direct content generation assistance and the writing task is highly creative. Generative AI-powered assistance is found to offer benefits in increasing people's productivity and confidence in writing. However, direct content generation assistance offered by AI also comes with risks, including decreasing people's sense of accountability and diversity in writing. We conclude by discussing the implications of our findings.

Summary

Li et al. conduct an online experiment with 379 U.S. Prolific workers using discrete choice methodology and randomized writing mode assignment to study how much people value ChatGPT writing assistance and how it affects their writing experience and performance in argumentative essay and creative story tasks.

Main Finding

Participants were willing to forgo $0.85 (28.3% of payment, equivalent to $1.71/hour) to receive ChatGPT content generation assistance, compared to $0.10 for editing assistance only. AI assistance increased productivity (reduced time and grammar errors) and writing confidence but decreased enjoyment, ability to self-express, sense of ownership, accountability, and writing diversity, especially when AI provided content generation rather than just editing assistance.
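
How a figure like the $0.85 above can be read off a choice model: a minimal sketch, assuming a two-parameter probit in which the only predictor is the pay difference between the AI-assisted and the independent option. The grouped data are synthetic, built to follow a known probit curve whose indifference point is set at $0.85 purely for illustration. They are not the study's data, the individual pay levels inside the $1.50 to $4.50 range are assumed, and the paper's actual specification (for example, any covariates) may differ. All function and variable names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is the probit link, pdf its slope


def fit_probit(groups, iterations=25):
    """Fit P(choose AI) = Phi(a + b * x) by Fisher scoring.

    groups: list of (x, n, k), where x is the AI option's pay minus the
    independent option's pay, n is the number of choices observed at x,
    and k is how many of those chose the AI option.
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, n, k in groups:
            eta = a + b * x
            # Clamp the probability away from 0 and 1 to keep divisions safe.
            p = min(max(STD_NORMAL.cdf(eta), 1e-12), 1 - 1e-12)
            dens = STD_NORMAL.pdf(eta)
            resid = (k - n * p) * dens / (p * (1 - p))  # score contribution
            w = n * dens * dens / (p * (1 - p))         # Fisher information weight
            s0 += resid
            s1 += resid * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        # Solve the 2x2 system (information matrix) * step = score.
        det = i00 * i11 - i01 * i01
        a += (i11 * s0 - i01 * s1) / det
        b += (i00 * s1 - i01 * s0) / det
    return a, b


def willingness_to_pay(a, b):
    """Pay a chooser would forgo for AI help: indifference is at x = -a / b."""
    return a / b


# Synthetic illustration only (assumed values, not taken from the paper).
TRUE_A, TRUE_B = 0.85, 1.0
BASE_PAY = 3.0                                       # independent-writing pay
AI_PAY_LEVELS = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]  # assumed grid in the stated range

groups = []
for pay in AI_PAY_LEVELS:
    x = pay - BASE_PAY
    n = 1000
    k = round(n * STD_NORMAL.cdf(TRUE_A + TRUE_B * x))  # expected count choosing AI
    groups.append((x, n, k))

a_hat, b_hat = fit_probit(groups)
wtp = willingness_to_pay(a_hat, b_hat)
print(f"estimated WTP: ${wtp:.2f}")
```

The estimate should land close to the 0.85 used to generate the counts. Dividing by the $3.00 base pay gives the share of payment forgone, which is how $0.85 corresponds to 28.3%.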

Primary Datasets

Experiment-generated data from 379 Prolific workers

Secondary Datasets

None

Key Methods
Online experiment combining a discrete choice (willingness-to-pay) design with randomly assigned writing modes. Participants (N=379) chose between independent writing ($3.00) and AI-assisted writing ($1.50 to $4.50) for an argumentative essay or a creative story. WTP was estimated with probit regression; perceptions and performance were analyzed with linear regression controlling for demographics, writing confidence, and ChatGPT familiarity.
Sample Period
2024
Geographic Coverage
United States
Sample Size
379 participants (183 in independent vs. human-primary treatment; 196 in independent vs. AI-primary treatment)
Level of Analysis
Individual
Occupation Classification
None
Industry Classification
None
Notes
Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-25 [Claude classification]: This is a human-computer interaction study examining the value users place on AI writing assistance using a discrete choice willingness-to-pay methodology. Three writing modes tested: Independent (no AI), Human-Primary (AI editing only), AI-Primary (AI content generation). Used NASA TLX for cognitive load, LDA topic modeling for coherence, sentence transformers for diversity metrics. Published in CHI 2024 conference proceedings.
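
The notes mention sentence transformers for diversity metrics without stating the exact metric. One common choice, assumed here and not confirmed by the source, is the mean pairwise cosine distance between the embeddings of the texts in a group. The sketch below uses small hand-made vectors in place of real sentence embeddings so that it runs without any model; the function names and the toy vectors are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_distance(embeddings):
    """Average of (1 - cosine similarity) over all unordered pairs."""
    pairs = list(combinations(embeddings, 2))
    return sum(1.0 - cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Toy stand-ins for embeddings (assumed values, not real model output).
spread_out_group = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
clustered_group = [[1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [0.9, 0.1, 0.1]]

print("spread-out group:", mean_pairwise_distance(spread_out_group))
print("clustered group:", mean_pairwise_distance(clustered_group))
```

Under this assumed metric, a group of essays whose embeddings point in similar directions scores lower, which is the pattern a finding of reduced writing diversity would correspond to.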