paper
★
2026-01
Kevin A. Bryan
Proposes integrating economic theory into ML via "theory-guided AI." The authors argue that theory aids external validity, but "particular functional forms we fit to get analytic tractability involve many assumptions that go beyond theoretical restrictions" (@Afinetheorem). In other words, using structural restrictions can regularize ML models without imposing ad-hoc functional assumptions.
economics LLM advanced
paper
2025-10
Diane Coyle, John Poquiz
Diane Coyle & John Poquiz discuss how transformative AI challenges current economic statistics. They outline how GDP and productivity measures miss AI-driven outputs and propose new metrics to better capture AI's impact on productivity and output.
economics growth advanced
paper
2025-09
Benjamin Manning, John Horton
Benjamin Manning & John Horton propose a method to build AI agents grounded in economic theory and data. They create agents using human data from "seed" games and theory-based instructions, then show in 883,320 novel game simulations that these agents predict human play better than standard game-theoretic models. This demonstrates AI's potential to generalize behavioral predictions in new strategic settings.
economics LLM advanced
paper
★
2025-02
Kevin A. Bryan
An experiment where an entire review paper was drafted by OpenAI's GPT-o3. It discusses how academia might adapt to AI-written text, covers current AI capabilities for drafting and literature surveys, weighs benefits vs. risks (misinformation, plagiarism), provides a 10-year outlook, and concludes with recommendations for journals and researchers to harness AI's benefits while safeguarding integrity (e.g. transparency about AI use).
writing LLM GPT ethics
paper
2025
Daron Acemoglu
Acemoglu evaluates AI's macroeconomic implications using a task-based model. Estimates modest TFP gains (no more than 0.66% over 10 years), arguing early evidence from easy-to-learn tasks may overstate future effects. Published in Economic Policy (2025). See also presentation slides.
economics growth labor advanced
paper
2024-11
Anton Korinek
Anton Korinek's JEL article on integrating generative AI into research workflows. It serves as a hands-on guide for using LLMs in economics, with emphasis on model reasoning and collaborative tools for economists.
economics LLM tools
paper
★
2024-07
Melissa Dell
Comprehensive review of deep learning methods for economists (NBER WP 32768). Discusses how CNNs and transformers can impute structure from unstructured data like satellite images or text. Covers classification, record linkage, generative models, and introduces the EconDL companion site with demo notebooks. Emphasizes that with proper tuning, deep nets scale affordably to millions/billions of observations.
economics python LLM advanced
paper
2024-01
Dietrich, Malerba, Gassmann
Dietrich, Malerba & Gassmann introduce a welfare-based evaluation of bias in ML targeting. Using proxy means test models for cash transfers, they weight targeting errors by income level and show that label biases and unstable model weights substantially understate welfare losses, unfairly disadvantaging some groups.
economics development ethics
paper
2023-05
Kevin Bryan
Kevin Bryan's guide (based on a Markus Academy talk) explaining how LLMs like GPT can assist in economics research tasks (coding, literature review, writing, etc.), with examples and practical tips.
economics LLM GPT coding
paper
2023-04
John J. Horton
John J. Horton et al. explore using GPT-3 as "Homo silicus": a simulated economic agent endowed with preferences and information to run virtual economic experiments. They show LLM agents can replicate classic experimental findings and easily test policy variations in silico.
economics LLM GPT microsimulation
paper
2026-03
Ajay K. Agrawal, John McHale, Alexander Oettl
Characterizes AI as a tool for augmentation through enhanced search over combinatorial spaces. Decomposes knowledge production into a multi-stage process revealing a 'jagged frontier' of AI in science, with differential returns across domains (data-rich biology vs. anomaly-sparse physics) and workflow stages. Shows how AI-expert scientists amplify nonlinear productivity gains.
economics science productivity advanced
paper
2026-03
Salomé Baslandze, Zachary Edwards, John Graham, Ty McClure, Brent H. Meyer, Michael Sparks, Sonya R. Waddell, Daniel Weitz
Surveys corporate executives to develop an index ranking job functions most negatively affected by AI. Provides firm-level evidence on how AI impacts different workforce roles and productivity, with direct evidence from decision-makers.
economics labor productivity advanced
paper
★
2026-02
Ivan Yotzov, Jose Maria Barrero, Nicholas Bloom, Philip Bunn, Steven J. Davis, et al.
First representative international data on firm-level AI use, surveying ~6,000 executives across the US, UK, Germany, and Australia. Finds ~70% of firms actively use AI but report little impact over the past 3 years, while forecasting AI will boost productivity by 1.4% and cut employment by 0.7% over the next 3 years.
economics labor productivity firms
paper
★
2026-01
Charles I. Jones
Argues that AI may be fundamentally different from prior general-purpose technologies like electricity or semiconductors, because automating intelligence itself has broader effects. Explores the scenario where machines can perform every cognitive and physical task more cheaply than humans and what economics says about that future.
economics growth automation advanced
paper
2026-01
Ruth Appel, Maxim Massenkoff, Peter McCrory, Miles McCain, Ryan Heller, Tyler Neylon, Alex Tamkin
Introduces five foundational measurements—task complexity, skill level, purpose, AI autonomy, and success—to track AI's economic impacts. Based on privacy-preserving analysis of 2 million conversations. Finds more complex tasks see the largest speed-ups, with college-level tasks sped up 12x.
economics LLM productivity measurement
paper
2026-03
Dirk Bergemann, Alessandro Bonatti, Alex Smolin
Develops a framework for optimal pricing and product design of LLMs, where a provider sells menus of token budgets to users who differ in their valuations across a continuum of tasks. Applies mechanism design theory to the economics of AI services.
economics LLM pricing mechanism design
paper
2025
Maryam Feyzollahi, Nima Rafizadeh
Uses a difference-in-differences framework across 25 leading economics journals over 24 years to measure LLM adoption via linguistic footprints. Finds a 4.76 percentage point increase in LLM-associated terms during 2023–2024, more than doubling from 2.85pp in 2023 to 6.67pp in 2024, documenting rapid integration of language models in economics writing.
economics LLM writing adoption
article
2026-03
Alex Imas, Soumitra Shukla
Imas (UChicago Booth) and Shukla (Harvard) argue AI exposure measures are misinterpreted as displacement threats when they indicate task augmentation. The real risk lies in low-dimensional occupations where full automation creates stronger firm incentives to eliminate positions. Uses an O-Ring model to show how partial automation can complement rather than substitute human labor.
economics labor LLM
paper
2026-03
Kevin A. Bryan
Reviews seven books on AI's economic impact, finding strong frameworks for AI as cheap prediction and adoption barriers, but arguing the literature offers little guidance on transformative scenarios — rapid labor churn, scientific acceleration, and existential risk — that policymakers need most.
economics LLM advanced review
paper
2025-12
Jens Ludwig, Sendhil Mullainathan, Ashesh Rambachan
Provides a rigorous econometric framework for using LLMs in empirical research. For prediction tasks, validity requires 'no training leakage.' For estimation, even high-accuracy LLM labels can bias regressions because errors correlate with covariates — the solution is a small human-coded validation sample to debias outputs. Forthcoming in Annual Review of Economics.
economics LLM econometrics advanced
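A minimal sketch of the debiasing idea summarized in the entry above, using only the Python standard library. It is not the authors' estimator: the simulated data, the sample sizes, and the helper `ols_slope` are assumptions made for illustration. The LLM label carries an error that rises with the covariate, so the naive slope is biased; a small human-coded validation sample estimates that bias, which is then subtracted.

```python
import random


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit with intercept: cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def simulate(n=20000, n_val=500, true_slope=1.0, error_slope=0.5, seed=0):
    """Return (naive, debiased) slope estimates on hypothetical simulated data."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    # Hypothetical human-coded outcome; treated as observed only for validation units.
    truth = [true_slope * xi + rng.gauss(0, 0.5) for xi in x]
    # Hypothetical LLM label: the truth plus an error correlated with x.
    llm = [t + error_slope * xi + rng.gauss(0, 0.5) for t, xi in zip(truth, x)]

    # Naive regression on LLM labels picks up the correlated error.
    naive = ols_slope(x, llm)

    # Validation sample: the first n_val units have both human and LLM labels.
    xv = x[:n_val]
    label_error = [l - t for l, t in zip(llm[:n_val], truth[:n_val])]
    bias = ols_slope(xv, label_error)

    return naive, naive - bias


naive, debiased = simulate()
print(f"naive slope: {naive:.3f}  debiased slope: {debiased:.3f}")
```

Under these assumed parameters the naive slope sits near 1.5 while the corrected slope sits near the true value of 1.0, even though only 500 of the 20,000 units are human-coded.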
paper
2025-11
Susan Athey, Fiona Scott Morton
Examines how market power in upstream AI affects downstream prices, industry structure, and welfare. Identifies a 'double harm' for displaced workers who face wage cuts from AI adoption and further harm from monopoly AI pricing. Derives an adoption frontier and policy implications for regulating AI usage and access fees.
economics competition labor policy
paper
2026-01
Anton Korinek
Comprehensive guide to building AI agents that autonomously conduct literature reviews, write and debug code, and orchestrate entire research workflows. Includes working code examples readers can immediately use. Shows how researchers can build complete analytical tools from English descriptions, handling everything from data uploads to regression analysis to visualization.
economics LLM agents coding
paper
2026-03
Forecasting Research Institute
Large-scale forecasting exercise surveying 69 leading economists, 52 AI experts, 38 superforecasters, and 401 members of the general public on AI's economic impact. Under rapid AI progress scenarios, economists forecast labor force participation dropping from 62.6% to 55% by 2050, with the richest 10% holding 80% of national wealth. In baseline scenarios, GDP growth and labor participation remain close to today's levels. Over 200 pages of detailed forecasts and methodology.
economics labor growth forecasting LLM
paper
2026-02
Daron Acemoglu, David Autor, Simon Johnson
Defines pro-worker technologies as those that expand worker capabilities and make human skills more valuable. Distinguishes five categories of AI-driven technological change: labor-augmenting, capital-augmenting, automating, expertise-leveling, and new task-creating. Argues that policy and design choices can steer AI development toward complementing workers rather than replacing them. Published jointly with Brookings/Hamilton Project.
economics labor policy automation
paper
2026-02
Nicholas Bloom, Christos Makridis
Uses Gallup Workforce Panel data to examine partisan differences in workplace AI adoption. While Democrats report higher frequent AI use (30.1% vs. 25% for Republicans in Q1 2026), this gap shrinks to statistical insignificance after controlling for education and reverses sign with occupation and industry fixed effects. Suggests the partisan AI divide is driven by educational and occupational sorting, not ideology.
economics labor LLM adoption
paper
2026-02
Serafin Grundl
A 51-page paper from the Federal Reserve Board benchmarking Claude Code as an empirical economist. The subtitle 'Like Humans but Without the Tails' suggests the AI performs comparably to human economists on average but with less variance in output quality.
economics claude LLM coding
paper
2026-03
Douglas K.G. Araujo, Harald Uhlig
Investigates how various LLMs behave in the Ultimatum Game, varying stake sizes and opponent type (human vs. AI). Finds that while some models approximate the rational benchmark, a distinct 'altruistic' mode emerges where LLMs propose hyper-fair distributions (>50%). Highlights the need for careful testing before deploying AI agents in economic settings.
economics LLM game theory behavioral
paper
2025-10
Pascual Restrepo
Argues that compute — not human capability — will be the scarce resource in an AGI economy. Most jobs won't be automated because replacing them wouldn't be worth the computing cost, not because they require uniquely human skills. Forthcoming in 'The Economics of Transformative AI' (Agrawal, Brynjolfsson & Korinek, eds.).
economics AGI labor growth advanced
paper
2026-04
Ezra Karger, Otto Kuusela, Jason Abaluck, Kevin A. Bryan, Basil Halperin, Todd R. Jones, Connacher Murphy, Philip Trammell, Josh Rosenberg, Philip Tetlock, et al.
Elicits forecasts from five groups — academic economists, AI company employees, policy researchers, accurate forecasters, and the general public — about AI's economic impact. Finds expectations of substantial AI capability advances by 2030, modest labor force participation declines, and 2.5% annual GDP growth. Under a rapid AI advancement scenario, forecasters project ~4% GDP growth and labor force participation falling to 55% by 2050. Expert disagreement stems primarily from differing views on highly capable AI's economic effects rather than on the pace of AI progress.
economics forecasting growth labor
paper
2026-04
Grace Liu, Brian Christian, Tsvetomira Dumbalska, Michiel A. Bakker, Rachit Dubey
Large-scale experiments show that after just ~10 minutes of AI-assisted problem-solving, participants gave up more frequently and performed worse once the AI was removed, compared to those who never used it. The persistence costs were concentrated among users who prompted AI to solve tasks directly rather than seeking hints. Effects replicated across arithmetic and reading comprehension, suggesting a general consequence of AI-assisted problem-solving rather than a domain-specific one.
AI cognition experiment productivity
paper
2026-04
Daron Acemoglu, Tianyi Lin, Asuman Ozdaglar, James Siderius
Examines how AI systems that synthesize population beliefs as training data influence social learning. Using an extended DeGroot model with an AI aggregator, introduces a 'learning gap' metric. Key finding: rapid updating degrades learning, while slower updates and localized aggregators trained on specialized information consistently improve outcomes. Consolidating local systems into a single global aggregator diminishes performance.
economics LLM information advanced
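For readers unfamiliar with the DeGroot model named in the entry above, here is a minimal sketch of standard DeGroot updating with one extra node standing in for an aggregator. This is plain DeGroot averaging, not the paper's extended model or its "learning gap" metric; the weight matrix, the initial beliefs, and the way the aggregator is wired in are all hypothetical.

```python
def degroot_step(beliefs, weights):
    """One DeGroot update: each node's new belief is a weighted average of current beliefs."""
    return [sum(w * b for w, b in zip(row, beliefs)) for row in weights]


def run(beliefs, weights, steps):
    """Apply the DeGroot update repeatedly and return the final beliefs."""
    for _ in range(steps):
        beliefs = degroot_step(beliefs, weights)
    return beliefs


# Three agents plus one hypothetical aggregator node (last index).
# Each row holds one node's listening weights and sums to 1.
weights = [
    [0.6, 0.2, 0.0, 0.2],        # agent 0 listens to itself, agent 1, and the aggregator
    [0.2, 0.6, 0.0, 0.2],        # agent 1
    [0.0, 0.0, 0.8, 0.2],        # agent 2 hears the others only through the aggregator
    [1 / 3, 1 / 3, 1 / 3, 0.0],  # aggregator averages the three agents
]
initial = [0.0, 0.5, 1.0, 0.5]

final = run(initial, weights, steps=200)
spread = max(final) - min(final)
print([round(b, 4) for b in final], f"spread={spread:.2e}")
```

Because the assumed network is connected and every agent keeps some weight on itself, beliefs converge to a consensus. Varying the weights placed on the aggregator row and column is one simple way to explore how an aggregator changes that consensus.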
paper
2026-04
Michael E. Waugh
Documents international trade patterns in AI-related goods using an LLM classification tool. AI-related products account for 23% of U.S. imports in 2025, with 73% growth since 2023 vs. 3% for non-AI products. Mexico and Taiwan dominate, accounting for roughly half of all U.S. AI-related trade. The U.S. goods trade deficit would have been nearly $200 billion smaller in 2025 without the AI expansion.
economics trade AI measurement
article
2026-04
Ali Hashim, Gizem Kosar, Wilbert van der Klaauw
NY Fed research using the November 2025 Survey of Consumer Expectations. College graduates are more than twice as likely to use AI at work (58.7% vs. 22.9%). Only 15.9% of employers provide AI training despite 38% of workers viewing it as important. Workers without AI training access would accept an 11.4% salary cut to gain it; those with access require a 24.2% raise to give it up.
economics labor AI training adoption
article
2026-04
Jeffrey S. Allen
Federal Reserve FEDS Notes article surveying the state of AI adoption across the U.S. economy. Census data show ~18% of firms adopted AI by year-end 2025, while the Atlanta Fed's Survey of Business Uncertainty estimates 78% of the labor force works at firms that have adopted AI. Adoption is highest in professional services and finance. Newer surveys show a stronger link between adoption rates and firm size.
economics AI adoption measurement Federal Reserve
article
2026-04
Alexander Bick, Adam Blandin, David Deming, Nicola Fuchs-Schündeln, Jonas Jessen
St. Louis Fed analysis finding that U.S. firms have a higher share of workers using AI than European firms, and that management practices are a surprisingly powerful predictor of cross-country AI adoption. A one-standard-deviation increase in the management index is associated with a 9.6 percentage point increase in AI adoption. The authors argue that narrowing the U.S.-Europe AI adoption gap may require first narrowing the management gap.
economics AI adoption international management
paper
2026-04
Karim Barhoumi, Fabia A. de Carvalho, Michael Gorbanyov, Yosuke Kido, David Koll, Dragana Ostojic, Baoping Shang, Natalia T. Tamirisa, Sally Toms, Era Dabla-Norris, Anh D. M. Nguyen, Yunhui Zhao
IMF staff note synthesizing insights from a high-level workshop and scenario-planning exercise co-organized with EconTAI. Argues AI should be treated as a macro-critical transition rather than a standard technology shock. Macroeconomic outcomes will depend less on frontier capability alone than on the speed and breadth of AI diffusion and institutional readiness. Covers implications for growth, labor markets, equality, financial stability, and governance.
economics AI policy growth IMF
paper
2026-03
Michael Blank, Gregor Schubert, Miao Ben Zhang
Studies generative AI's impact on U.S. households' time allocation using browsing data from 200,000+ home devices (2021–2024). Finds that ChatGPT adoption substantially increases leisure browsing while leaving productive task time unchanged — households primarily use AI for productive non-market tasks (job hunting, travel planning, shopping), freeing up leisure time. Implies large home-productivity gains from generative AI, but raises digital-divide concerns as younger, higher-income users adopt faster.
economics AI productivity households inequality
paper
2026-04
Eleanor W. Dillon, Sonia Jaffe, Nicole Immorlica, Christopher T. Stanton
Field experiment across 66 firms and 7,137 knowledge workers randomly given access to a generative AI assistant integrated into the email, meeting, and writing applications they already used. In the second half of the six-month experiment, the 80% of treated workers who actively used the tool spent two fewer hours per week on email and reduced their time working outside regular hours. Beyond these individual time savings, the authors detect no shifts in the quantity or composition of workers' tasks, suggesting that broader reallocation of responsibilities requires institutional and team-level changes rather than just individual AI access.
economics AI productivity field experiment knowledge work
paper
2026-05
Caleb Maresca
Develops a heterogeneous-agent asset pricing model in which transformative AI capable of automating most human labor can lower interest rates even as it dramatically accelerates growth. Under baseline calibrations, the risk-free rate falls to near zero despite growth rising from 2% to 11%, and the equity premium expands from 6% to over 20%. The key mechanism is that labor displacement risk generates massive precautionary saving demand that outweighs the higher productivity effect. Advises caution when interpreting long-term bond yields as a signal of market expectations of transformative AI. Highlighted by Tyler Cowen.
economics AI AGI interest rates asset pricing macro
paper
2026-05
Tom Davidson, Basil Halperin, Thomas Houlden, Anton Korinek
Develops a semi-endogenous growth model with an innovation network to study when AI-driven automation of research leads to superexponential ('explosive') growth. Derives a condition under which two reinforcing channels — a technological feedback loop across research sectors and an economic feedback loop where higher output finances more research — overcome diminishing returns to ideas. In a simulation calibrated to AI progress trends, fully automating software research plus modest (5%) automation elsewhere produces a singularity within six years. Includes an interactive online simulator for exploring growth paths.
economics AI growth automation singularity
paper
2026-05
Santiago Afonso, Sebastian Galiani, Ramiro H. Gálvez, Raul A. Sosa
Proposes DRIL (Deep Research on a Loop), a methodology that uses AI agents to assemble datasets from publicly available sources. DRIL applies a fixed research instrument across a mapped unit space with a two-stage architecture separating design from implementation. Applied to updating the Global Tax Expenditures Database for eight Latin American and Caribbean countries, it produced 129 sources and 136 evidence records at the cost of a standard LLM subscription. Argues that even partial automation of dataset construction can shift the production function of empirical economics.
economics AI data agents methodology
paper
2026-04
Tania Babina
Reviews firm-level data on artificial intelligence and the emerging evidence on AI's economic effects. Argues that measurement is central: different AI datasets capture different objects (invention vs. use, internal capability vs. outsourcing, realized activity vs. investor perceptions) and can therefore lead to different conclusions. Develops a framework for choosing among these measures and surveys available data sources on firm AI efforts. Synthesizes evidence on AI's effects on firm growth, valuation, productivity, risk, labor, competition, and financial markets.
economics AI firms productivity measurement survey
paper
2026-03
Martin Beraja, Eduard Talamàs
Introduces a new metric called VOLT (Value of Organizational Learning Technologies) that measures the potential increase in economic output if firms could learn faster. Using 2023 U.S. Census data on business establishments, finds that VOLT for the American economy is approximately 2.0, meaning AI-driven organizational learning technologies have the potential to double aggregate economic output in the long run. Roughly three-quarters of potential gains come from extending firm lifespans rather than boosting productivity directly. Industry-by-industry analysis reveals that knowing how exposed an industry is to LLMs tells almost nothing about how much it stands to gain from AI as an organizational learning tool.
economics AI growth productivity firms organizational learning
paper
2026-04
Michelle Yin, Hoa Vu, Claudia Persico
Demonstrates that LLM-based occupational AI exposure measures — widely used to estimate AI's labor market effects — are highly fragile. Replicating the dominant rubric with three frontier models on all 18,797 O*NET tasks, mean exposure diverges 3.6-fold (one model rates 14% of tasks as directly exposed while another rates 51%). In difference-in-differences employment regressions, coefficient magnitudes vary 2.4-fold across annotators, and county-level estimates flip sign depending on which model is used. Formalizes this as non-classical measurement error, cautioning against treating evolving LLMs as static instruments.
economics LLM labor measurement methodology
paper
2026-04
Ning Li
Analyzes 953 economics papers — 912 AI-generated from the APE project and 41 human papers published in the AER and AEJ: Economic Policy — to decompose the quality gap into idea quality and execution quality. The idea gap is large (human papers achieve 47.1% mean exceptional probability vs. 16.5% for AI; Cohen's d = 2.23), while the execution gap is smaller (4.38/5.0 vs. 3.84). Idea quality accounts for roughly 71% of the overall quality difference. Only 7 AI papers (0.8%) surpass the median human paper on both dimensions simultaneously, suggesting ideation remains the primary bottleneck to competitive AI-generated economics research.
economics AI research quality ideation methodology
paper
2026-03
Robert Novy-Marx, Mihail Velikov
Published in the Journal of Economic Literature, demonstrates a pipeline for mass-producing academic finance papers using LLMs. After mining 30,000+ potential return predictors from accounting data, the authors use Claude to generate nearly 400 complete, publication-ready papers with distinct theoretical justifications — each indistinguishable from human-authored research. Serves as both a proof of concept for AI-enhanced research efficiency and a cautionary tale about industrializing HARKing (hypothesizing after results are known). Highlighted by Tyler Cowen.
economics AI LLM finance methodology peer-review
article
2026-05
Thomas Lyttelton, Maxim Massenkoff, Nathan Wilmers
Research from Anthropic's economics team on how AI agents able to execute research end-to-end will reshape social science. It tests coding agents on real social-science tasks and asks to what extent AI may automate innovation and change research productivity.
economics AI LLM labor methodology
paper
2026-05
Meysam Alizadeh, Fabrizio Gilardi, Mohsen Mosleh, Enkelejda Kasneci
A preprint testing whether AI coding agents (Claude Code and Codex) match human methodological diversity in a "many-analysts" design on an immigration and social-policy hypothesis. The agents reproduce human-like method diversity and broadly similar effect estimates, but show a vulnerability at the interpretation layer — a confirmatory prompt flipped Claude Code's verdict from 10% to 90% support without meaningfully changing the coefficients.
economics AI LLM methodology peer-review
paper
2026-06
Jessica Wachter, Jonathan Wachter
The five largest U.S. tech firms spent $380 billion on capex in 2025 and are forecast to roughly double that in 2026 — risking bankruptcy unless expected profits grow commensurately. Embeds this observation in a two-sector open-economy model with rare productivity booms. Calibrating to observed investment implies a boom raises AI-sector productivity by a factor of 2.7, with additional cumulative GDP growth of 5–58 percentage points by 2030 and AI shares of the economy ranging from 8% to 39%.
economics AI investment growth macro asset pricing
paper
2026-06
Alexis Akira Toda
Tests whether AI models (Gemini, Refine, Claude, ChatGPT) can find errors in four published economic theory papers, each containing a known mistake. ChatGPT Pro performed best, occasionally constructing counterexamples and corrected proofs, while other models fared worse. No model located a true error without substantial human guidance. Argues a competent human paired with a frontier model can outperform current peer review, but AI cannot yet refute economic theory on its own. Highlighted by Tyler Cowen on Marginal Revolution.
economics AI peer-review theory methodology
paper
2026-06
Terry Gregory, Najada Feimi, Christina Gathmann, David Marguerit
LISER policy brief drawing on more than 75 million online job vacancies from Belgium, France, Germany, and Luxembourg between 2018 and 2023 to track how employers' skill requirements evolve as occupations become more exposed to AI. It argues AI is reshaping the content of jobs rather than eliminating them: demand is rising fast for AI-related analytical skills, employers increasingly seek workers who combine analytical tools with judgment and leadership, and the adjustment happens largely within existing occupations. Concludes the policy priority should be reskilling and upskilling within current occupational trajectories.
economics AI labor skills Europe policy
paper
2026-06
Pavel Kireyev, Roberto Rafael Maura Rivero
A 26-page guide arguing that the decisions shaping how AI systems are built and aligned are largely microeconomic problems. It opens with a self-contained primer on the RLHF pipeline, showing that reward modeling, preference aggregation, and policy optimization rest on equivalences with discrete choice, social choice, and principal-agent models. It then maps specific competencies (behavioral economics for annotator biases, mechanism design for feedback elicitation, social choice for preference aggregation, contract theory and information design for alignment, game theory for multi-agent safety, production economics for compute scaling laws) to concrete AI research problems, and closes with practical paths into academia and AI labs plus a companion repository.
economics LLM alignment RLHF advanced
article
2026-06
Zoe Hitzig, Maxim Massenkoff, Eva Lyubich, Shaoyi Zhang, Ryan Heller, Peter McCrory
Anthropic Economic Research report based on a privacy-preserving analysis of ~400,000 Claude Code sessions from ~235,000 people (October 2025 to April 2026). It introduces a framework describing what work is done, who does it, and whether it succeeds. Key findings: people make ~70% of planning decisions while Claude makes ~80% of execution decisions; domain expertise (not coding proficiency) drives success, with sessions rated expert reaching verified success more than twice as often as novice ones; non-software occupations succeed within seven points of software engineers; and the estimated value of the typical task rose ~25% over the seven months.
economics Claude Code labor productivity agentic coding
paper
2026-06
Mert Demirer, Leon Musolff, Liyuan Yang
A matched event-study using data on 100,000+ GitHub developers and their AI-usage telemetry to estimate the productivity effects of successive AI coding tools. Autocomplete, interactive coding agents, and autonomous agents raise commits by 40%, 140%, and 180% respectively, but these gains attenuate up the production chain: the 180% commit effect falls to 50% for projects and just 30% for actual releases. The authors interpret this through a weak-link model with an estimated 0.25 elasticity of substitution between AI and human effort (strong complementarity), and confirm the pattern across four app marketplaces. An MIT Sloan working paper.
economics AI tools productivity coding software development
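To make the 0.25 elasticity of substitution in the entry above concrete, here is a minimal sketch using a textbook two-input CES aggregate. It is an illustration, not the authors' weak-link specification or calibration: the functional form, the equal share parameter, and the normalization of both inputs to 1 are assumptions. Only the elasticity (0.25) and the 180% input increase come from the summary.

```python
def ces_output(ai, human, sigma=0.25, share=0.5):
    """CES aggregate of AI and human effort with elasticity of substitution sigma.

    share is a hypothetical weight on the AI input; sigma=0.25 implies rho=-3.
    """
    rho = (sigma - 1.0) / sigma
    return (share * ai**rho + (1.0 - share) * human**rho) ** (1.0 / rho)


baseline = ces_output(ai=1.0, human=1.0)
# Raise the AI-side input by 180% while holding human effort fixed.
boosted = ces_output(ai=2.8, human=1.0)
gain = boosted / baseline - 1.0
print(f"output gain: {gain:.1%}")
```

Under these assumed parameters the output gain is roughly 24%, far below the 180% rise in the AI input, because with strong complementarity the unchanged human input becomes the binding constraint. The sketch shows the direction of the attenuation only; it does not reproduce the paper's 50% and 30% figures.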
paper
2026-06
Moran Koren
Argues that by 2026 the binding constraint on machine-assisted economic theory is no longer producing mathematics but trusting it, since a fluent model will prove a false theorem as readily as a true one. Proposes a verification-first protocol instantiated as three methods that differ in how work is checked: a single disciplined pass, an adversarial prover-verifier pair (Claude Opus 4.8 proposing, OpenAI Codex refuting, the author triaging), and a structured multi-agent project with a reviewer gate. Demonstrated on a worked mechanism-design example (a Groves/Pigouvian incentive mechanism for the Gans-Kominers grade-inflation model), concluding that external verification, not model capability, is the key design variable.
economics LLM economic theory agents verification