- Key Methods
- Preregistered randomized controlled experiment with 435 participants assigned to 122 teams (human-only, single-AI, or multiple-AI conditions) performing two professional tasks; second experiment with 139 individual-AI pairs; OLS regression analysis with team-level controls, Bayesian regression for null effects, and interaction analysis of human-AI engagement patterns
- Sample Period
- 2024
- Geographic Coverage
- US (Prolific participants restricted to US residents)
- Sample Size
- Experiment I: 435 participants in 122 teams completing 2 tasks (244 team-task observations); Experiment II: 139 individuals completing 2 tasks (278 individual-task observations)
- Level of Analysis
- Individual, Firm
- Occupation Classification
- None
- Industry Classification
- None
- Replication Package
- Partial
- Notes
- Tsinghua University Working Paper 2405.17924
[Claude classification]: Uses ORIV (Obviously Related Instrumental Variables) methodology in robustness checks to address measurement error. Preregistered study. Task is age classification from photographs using IMDB-WIKI dataset. AI predictions come from Caffe deep learning model. Incentivized using binarized scoring rule.
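The ORIV approach mentioned above (Gillen, Snowberg, and Yariv's "Obviously Related Instrumental Variables") corrects attenuation bias from measurement error by stacking two IV regressions, instrumenting each of two noisy measures of the same variable with the other. A minimal sketch of the just-identified stacked estimator; the function name and interface are illustrative, not from the paper:

```python
import numpy as np

def oriv(y, x1, x2):
    """ORIV point estimate: stack (y on x1, instrumented by x2) over
    (y on x2, instrumented by x1), with stack-specific intercepts.
    x1 and x2 are two noisy measures of the same latent regressor."""
    n = len(y)
    ones, zeros = np.ones(n), np.zeros(n)
    # Stacked outcome, regressors, and instruments
    Y = np.concatenate([y, y])
    X = np.column_stack([np.concatenate([x1, x2]),       # mismeasured regressor
                         np.concatenate([ones, zeros]),  # intercept, stack 1
                         np.concatenate([zeros, ones])]) # intercept, stack 2
    Z = np.column_stack([np.concatenate([x2, x1]),       # cross-instrument
                         X[:, 1], X[:, 2]])
    # Just-identified IV: beta = (Z'X)^{-1} Z'Y
    beta = np.linalg.solve(Z.T @ X, Z.T @ Y)
    return beta[0]
```

Because each measure's error is uncorrelated with the other measure, the cross-instrument recovers the undistorted slope that plain OLS on either noisy measure would attenuate.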
[Claude classification]: Preregistered study (OSF: 5su8c). Uses GPT-4.0 API to assess quality of human input to AI. Human judges blind to conditions rated outputs on quality, novelty, and usefulness (Cronbach's alpha 0.68-0.77). Coarsened Exact Matching used for robustness checks. Key finding: centralized AI usage (one or few team members engaging deeply) more effective than distributed engagement in multiple-AI teams. Teams with higher IQ, familiarity, and size benefited more from multiple AIs. AI integration improved team potency and satisfaction but not coordination or information elaboration.
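The Coarsened Exact Matching robustness check mentioned above works by binning covariates, keeping only strata that contain both treated and control units, and reweighting controls to the treated distribution. A simplified sketch of that weighting logic, assuming equal-width bins (the paper's coarsening choices are not specified here; names are illustrative):

```python
import numpy as np
from collections import defaultdict

def cem_weights(treat, X, n_bins=4):
    """Coarsened Exact Matching weights: coarsen each covariate into
    equal-width bins, match exactly on the bin signature, drop strata
    lacking either group, and weight matched controls so each stratum's
    controls mirror its treated share."""
    # Coarsen each covariate column into n_bins equal-width bins
    coarse = np.column_stack([
        np.digitize(col, np.linspace(col.min(), col.max(), n_bins + 1)[1:-1])
        for col in X.T
    ])
    strata = defaultdict(list)
    for i, key in enumerate(map(tuple, coarse)):
        strata[key].append(i)
    # Keep only strata containing both treated and control units
    matched = [np.array(idx) for idx in strata.values()
               if treat[idx].any() and (1 - np.asarray(treat)[idx]).any()]
    M_T = sum(treat[idx].sum() for idx in matched)          # matched treated
    M_C = sum((1 - treat[idx]).sum() for idx in matched)    # matched controls
    w = np.zeros(len(treat))
    for idx in matched:
        t, c = idx[treat[idx] == 1], idx[treat[idx] == 0]
        w[t] = 1.0                                   # treated keep weight 1
        w[c] = (len(t) / len(c)) * (M_C / M_T)       # standard CEM control weight
    return w
```

Unmatched units receive weight zero and drop out of the subsequent weighted regression, which is what makes CEM a useful check on covariate balance across the experimental conditions.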