Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?
Horton
2023
NBER Working Paper 31122
234 citations
AI capability / benchmarking
LLM / Generative AI
Human-AI collaboration
Decision-making
Augmentation vs. substitution
Abstract
Newly developed large language models (LLMs), because of how they are trained and designed, are implicit computational models of humans: a homo silicus. LLMs can be used like economists use homo economicus: they can be given endowments, information, preferences, and so on, and then their behavior can be explored in scenarios via simulation. Experiments using this approach, derived from Charness and Rabin (2002), Kahneman, Knetsch and Thaler (1986), and Samuelson and Zeckhauser (1988), show qualitatively similar results to the originals, but it is also easy to try variations for fresh insights. LLMs could allow researchers to pilot studies via simulation first, searching for novel social science insights to test in the real world.
Summary
Horton uses computational simulations with GPT-3 to demonstrate that large language models can qualitatively replicate findings from classic behavioral economics experiments (dictator games, fairness judgments, status quo bias, and labor substitution), proposing LLMs as "homo silicus" agents for piloting studies.
Main Finding
GPT-3 text-davinci-003 replicates the qualitative patterns of classic experiments: it exhibits social preferences in dictator games when appropriately endowed, shows political variation in fairness judgments (in line with the original result that 82% find price gouging unfair), displays status quo bias in budget allocation, and demonstrates labor-labor substitution under minimum wages.
Primary Datasets
GPT-3 API responses (text-davinci-003, text-ada-001, text-babbage-001, text-currie-001)
Key Methods
Computational simulation with GPT-3 API calls; agents endowed with different preferences, political views, and beliefs; systematic variation of prompts and scenarios; comparison of AI responses to original human experimental results
Sample Period
2023
Geographic Coverage
Not applicable (computational simulation)
Sample Size
Varies by experiment: 500 observations (100 agents × 5 scenarios) for status quo bias; 360 observations for minimum wage simulation; multiple API calls per scenario across experiments
Level of Analysis
Individual
Occupation Classification
None
Industry Classification
None
Replication Package
Yes
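The simulation design described under Key Methods and Sample Size can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the endowment texts, scenario texts, prompt layout, and the names `build_design`, `run_simulation`, and `stub_model` are all hypothetical, and the model call is replaced by a stub so the example runs without any API access. It only shows how crossing 100 agents with 5 scenarios yields the 500 observations reported for the status quo bias experiment.

```python
from collections import Counter
from itertools import product
from typing import Callable

# Hypothetical endowments and scenarios, for illustration only;
# the paper's actual prompts are not reproduced here.
ENDOWMENTS = [
    "You only care about fairness between players.",
    "You only care about your own payoff.",
]
SCENARIOS = [f"Scenario {i}: choose option A or option B." for i in range(1, 6)]


def build_design(n_agents: int, endowments: list[str], scenarios: list[str]) -> list[dict]:
    """Cross every agent with every scenario; each agent keeps one endowment."""
    design = []
    for agent_id, scenario in product(range(n_agents), scenarios):
        # Assign endowments to agents round-robin (an assumption of this sketch).
        endowment = endowments[agent_id % len(endowments)]
        design.append({
            "agent_id": agent_id,
            "endowment": endowment,
            "scenario": scenario,
            "prompt": f"{endowment}\n{scenario}\nAnswer with A or B:",
        })
    return design


def run_simulation(design: list[dict], query_model: Callable[[str], str]) -> list[dict]:
    """Send each prompt to the model and record the stripped response."""
    return [{**row, "response": query_model(row["prompt"]).strip()} for row in design]


def stub_model(prompt: str) -> str:
    """Placeholder standing in for a real LLM call; always answers 'A'."""
    return "A"


design = build_design(100, ENDOWMENTS, SCENARIOS)
results = run_simulation(design, stub_model)

# Tally responses by endowment, the kind of comparison made against human results.
tally = Counter((r["endowment"], r["response"]) for r in results)
print(len(results))  # 100 agents x 5 scenarios = 500 observations
```

Passing the model as a function keeps the design separate from any particular API, so the same grid could be rerun against a different model by swapping `stub_model` for a real client call.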
Notes
NBER WP 31122; published in EC'24. Demonstrates LLMs can replicate classic economic experiments (social preferences in dictator games, status quo bias, fairness norms), proposing 'homo silicus' as a complement to homo economicus for piloting studies via simulation.
[Claude classification]: Published at EC'24 (ACM Conference on Economics and Computation). This is a methodological/conceptual paper proposing LLMs as 'homo silicus' - computational models of humans that can be used to pilot studies via simulation. The experiments are computational simulations using GPT-3, not experiments with human subjects. Only the most advanced GPT-3 model (text-davinci-003) successfully changes behavior based on endowed preferences; earlier models fail this test. The paper demonstrates qualitative replication of classic experiments but emphasizes that results from AI experiments require empirical confirmation with real humans. Cost: approximately $50 total for all experiments. Regression used only in minimum wage simulation (Table 1) to show effects on hired worker characteristics.