AI for Economists
A curated collection of resources for economists working with AI and LLMs — from research papers and courses to practical tools and coding guides.
Last updated: May 2026.
Rich vs. poor countries face different AI transitions: with few skilled workers, LMICs may benefit more from fully automated intelligence than augmentation.
Companion paper for ZeroPaper, an autonomous end-to-end research pipeline (~30 agents, 10 stages, 6 adversarial gates) producing JFQA-quality papers at ~$2 each.
Field experiment in 66 firms, 7,137 knowledge workers: GenAI access cut ~2 hrs/week of email but did not shift task composition absent institutional change.
UVM Economics bootcamp slides on agentic AI tools for research and teaching: context windows, voice files, skills, and research pipelines.
Claude Code plugin bundling skills for citation grounding, empirical integrity, systematic reviews, Zotero ops and parallel-critic manuscript revision.
ChatGPT boosts home productivity 76–176% on digital chores, but freed time goes to leisure; younger/richer households adopt faster, widening digital divide.
IMF scenario-planning exercise: AI is a macro-critical transition; outcomes depend on diffusion speed and institutional readiness, not just capability.
Management practices are a key predictor of cross-country AI adoption gaps; narrowing U.S.-Europe AI gap may require closing the management gap.
Fed survey of U.S. AI adoption: ~18% of firms adopted AI by end-2025, with highest rates in professional services and finance.
Open-source AI paper reviewer using your OpenRouter key. Costs <$2 per review, rivals Reviewer3/Refine.ink/Stanford Agentic Reviewer. No data retention.
Practical guide for academic economists on installing, configuring, and optimizing Claude Code in VS Code for R, Stata, Python, and LaTeX workflows.
Imas argues AI triggers structural change, not mass unemployment — spending shifts toward 'relational' sectors where human involvement is the value.
NY Fed: Workers value AI training at 11% of salary. College grads are 2x more likely to use AI at work, but only 16% of employers offer training.
AI-related goods now account for 23% of U.S. imports, with 73% growth since 2023. Mexico and Taiwan dominate bilateral flows.
Acemoglu et al. show that AI knowledge aggregators can improve or degrade social learning depending on speed and scope of training.
After ~10 min of AI-assisted problem-solving, users gave up more and performed worse without AI — especially those who sought direct solutions.
Elicits AI economic forecasts from five expert and public groups; disagreement centers on AI's effects, not the pace of progress.
AI reshapes economics by making empirical execution cheap — elevating the value of institutional reasoning and price theory.
AI-assisted workflow automates large-scale paper replication across 384 studies and 3,382 models from top political science journals.
In an AGI world, compute is the scarce resource — most jobs won't be automated because replacing them isn't worth the cost.
How do LLMs split the pie? Experiments in the Ultimatum Game reveal altruistic, rational, and human-mimicking modes across models.
Anthropic's official courses: API fundamentals, prompt engineering, real-world prompting, evaluations, and tool use.
Federal Reserve Board paper benchmarking Claude Code as an empirical economist — comparable to humans on average, with less variance.
Open-source CLI for AI-powered research: deep research briefs, literature reviews, paper audits, and experiment replication.
Official OpenAI plugin for Claude Code: normal review, adversarial review, and task handoff to Codex. Open source.
Two-part series plus free Claude Code skills for paper review, grant review, pre-analysis plans, and code-paper correspondence checks.
Research Papers 38
Proposes integrating economic theory into ML via "theory-guided AI." The authors argue that theory aids external validity, though, as @Afinetheorem notes, the "particular functional forms we fit to get analytic tractability involve many assumptions that go beyond theoretical restrictions." In other words, theory-derived restrictions can regularize ML models, but the specific functional forms chosen for tractability add assumptions beyond what theory alone requires.
Diane Coyle & John Poquiz discuss how transformative AI challenges current economic statistics. They outline how GDP and productivity measures miss AI-driven outputs and propose new metrics to better capture AI's impact on productivity and output.
Manning and Horton propose a method to build AI agents grounded in economic theory and data. They create agents using human data from "seed" games and theory-based instructions, then show in 883,320 novel game simulations that these agents predict human play better than standard game-theoretic models. This demonstrates AI's potential to generalize behavioral predictions to new strategic settings.
An experiment where an entire review paper was drafted by OpenAI's o3 model. It discusses how academia might adapt to AI-written text, covers current AI capabilities for drafting and literature surveys, weighs benefits vs. risks (misinformation, plagiarism), provides a 10-year outlook, and concludes with recommendations for journals and researchers to harness AI's benefits while safeguarding integrity (e.g. transparency about AI use).
Acemoglu evaluates AI's macroeconomic implications using a task-based model. Estimates modest TFP gains (no more than 0.66% over 10 years), arguing early evidence from easy-to-learn tasks may overstate future effects. Published in Economic Policy (2025). See also presentation slides.
Anton Korinek's JEL article on integrating generative AI into research workflows. It serves as a hands-on guide for using LLMs in economics, with emphasis on model reasoning and collaborative tools for economists.
Comprehensive review of deep learning methods for economists (NBER WP 32768). Discusses how CNNs and transformers can impute structure from unstructured data like satellite images or text. Covers classification, record linkage, generative models, and introduces the EconDL companion site with demo notebooks. Emphasizes that with proper tuning, deep nets scale affordably to millions/billions of observations.
Dietrich, Malerba & Gassmann introduce a welfare-based evaluation of bias in ML targeting. Using proxy means test models for cash transfers, they weight targeting errors by income level and show that label biases and unstable model weights substantially understate welfare losses, unfairly disadvantaging some groups.
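The welfare-weighting logic can be sketched in a few lines: score each targeting error by how far the affected household falls below the poverty line, so mistakes against the poorest count most. A minimal illustration with hypothetical numbers and a simple shortfall weight (not the authors' exact specification):

```python
def welfare_weighted_error(incomes, eligible, predicted, poverty_line):
    """Average targeting error, with each mistake weighted by the
    household's income shortfall below the poverty line (so errors
    against the poorest households count the most)."""
    total = 0.0
    for income, is_eligible, is_predicted in zip(incomes, eligible, predicted):
        if is_eligible != is_predicted:  # exclusion or inclusion error
            total += max(0.0, (poverty_line - income) / poverty_line)
    return total / len(incomes)

incomes  = [20, 40, 80, 120]
eligible = [1, 1, 1, 0]  # truly poor households (income below the line of 100)

# Both models below make exactly one error (a 25% error rate)...
miss_poorest  = welfare_weighted_error(incomes, eligible, [0, 1, 1, 0], 100)
miss_marginal = welfare_weighted_error(incomes, eligible, [1, 1, 0, 0], 100)
# ...but excluding the poorest household is far costlier in welfare terms:
# miss_poorest = 0.2 vs. miss_marginal = 0.05
```

An unweighted error rate treats the two models identically; the welfare weight is what separates them.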
Kevin Bryan's guide (based on a Markus Academy talk) explaining how LLMs like GPT can assist in economics research tasks (coding, literature review, writing, etc.), with examples and practical tips.
John J. Horton et al. explore using GPT-3 as "Homo silicus" - a simulated economic agent endowed with preferences and information to run virtual economic experiments. They show LLM agents can replicate classic experimental findings and easily test policy variations in silico.
Characterizes AI as a tool for augmentation through enhanced search over combinatorial spaces. Decomposes knowledge production into a multi-stage process revealing a 'jagged frontier' of AI in science, with differential returns across domains (data-rich biology vs. anomaly-sparse physics) and workflow stages. Shows how pairing AI with expert scientists amplifies nonlinear productivity gains.
Surveys corporate executives to develop an index ranking job functions most negatively affected by AI. Provides firm-level evidence on how AI impacts different workforce roles and productivity, with direct evidence from decision-makers.
First representative international data on firm-level AI use, surveying ~6,000 executives across the US, UK, Germany, and Australia. Finds ~70% of firms actively use AI but report little impact over the past 3 years, while forecasting AI will boost productivity by 1.4% and cut employment by 0.7% over the next 3 years.
Argues that AI may be fundamentally different from prior general-purpose technologies like electricity or semiconductors, because automating intelligence itself has broader effects. Explores the scenario where machines can perform every cognitive and physical task more cheaply than humans and what economics says about that future.
Introduces five foundational measurements—task complexity, skill level, purpose, AI autonomy, and success—to track AI's economic impacts. Based on privacy-preserving analysis of 2 million conversations. Finds more complex tasks see the largest speed-ups, with college-level tasks sped up 12x.
Develops a framework for optimal pricing and product design of LLMs, where a provider sells menus of token budgets to users who differ in their valuations across a continuum of tasks. Applies mechanism design theory to the economics of AI services.
Uses a difference-in-differences framework across 25 leading economics journals over 24 years to measure LLM adoption via linguistic footprints. Finds a 4.76 percentage point increase in LLM-associated terms during 2023–2024, more than doubling from 2.85pp in 2023 to 6.67pp in 2024, documenting rapid integration of language models in economics writing.
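The canonical 2x2 difference-in-differences comparison behind such designs can be sketched as follows (the numbers here are purely illustrative, not the paper's estimates):

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Canonical 2x2 difference-in-differences: the change in the treated
    group net of the contemporaneous change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical shares of articles containing LLM-associated terms (percent),
# before and after LLMs became widely available:
effect = did_estimate(treated_pre=1.0, treated_post=6.5,
                      control_pre=0.9, control_post=1.1)
# effect = 5.3 percentage points, net of the control-group trend
```

The paper's actual design extends this logic across 25 journals and 24 years; the netting-out of the control trend is the core identifying idea.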
Imas (UChicago Booth) and Shukla (Harvard) argue AI exposure measures are misinterpreted as displacement threats when they indicate task augmentation. The real risk lies in low-dimensional occupations where full automation creates stronger firm incentives to eliminate positions. Uses an O-Ring model to show how partial automation can complement rather than substitute human labor.
Reviews seven books on AI's economic impact, finding strong frameworks for AI as cheap prediction and adoption barriers, but arguing the literature offers little guidance on transformative scenarios — rapid labor churn, scientific acceleration, and existential risk — that policymakers need most.
Provides a rigorous econometric framework for using LLMs in empirical research. For prediction tasks, validity requires 'no training leakage.' For estimation, even high-accuracy LLM labels can bias regressions because errors correlate with covariates — the solution is a small human-coded validation sample to debias outputs. Forthcoming in Annual Review of Economics.
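The debiasing idea — correct the full-sample LLM-based estimate with the human-vs-LLM discrepancy measured on a small validation sample — can be sketched for a simple mean. This is a stylized sketch in the spirit of prediction-powered inference, not the paper's exact estimator:

```python
import statistics

def debiased_mean(llm_labels_all, llm_labels_val, human_labels_val):
    """Debias a mean computed from LLM labels using a small human-coded
    validation sample: full-sample LLM mean plus the average
    human-minus-LLM discrepancy on the validation subsample."""
    correction = statistics.mean(
        h - l for h, l in zip(human_labels_val, llm_labels_val)
    )
    return statistics.mean(llm_labels_all) + correction

llm_all   = [1, 0, 1, 1, 0, 1]   # cheap LLM labels on the full sample
llm_val   = [1, 0, 1]            # LLM labels on the validation subsample
human_val = [0, 0, 1]            # human labels on the same subsample
corrected = debiased_mean(llm_all, llm_val, human_val)
# naive LLM mean = 2/3; corrected estimate = 1/3
```

In a regression setting the same principle applies, but the correction must account for error that correlates with covariates, which is exactly the paper's point.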
Examines how market power in upstream AI affects downstream prices, industry structure, and welfare. Identifies a 'double harm' for displaced workers who face wage cuts from AI adoption and further harm from monopoly AI pricing. Derives an adoption frontier and policy implications for regulating AI usage and access fees.
Comprehensive guide to building AI agents that autonomously conduct literature reviews, write and debug code, and orchestrate entire research workflows. Includes working code examples readers can immediately use. Shows how researchers can build complete analytical tools from English descriptions, handling everything from data uploads to regression analysis to visualization.
Large-scale forecasting exercise surveying 69 leading economists, 52 AI experts, 38 superforecasters, and 401 members of the general public on AI's economic impact. Under rapid AI progress scenarios, economists forecast labor force participation dropping from 62.6% to 55% by 2050, with the richest 10% holding 80% of national wealth. In baseline scenarios, GDP growth and labor participation remain close to today's levels. Over 200 pages of detailed forecasts and methodology.
Defines pro-worker technologies as those that expand worker capabilities and make human skills more valuable. Distinguishes five categories of AI-driven technological change: labor-augmenting, capital-augmenting, automating, expertise-leveling, and new task-creating. Argues that policy and design choices can steer AI development toward complementing workers rather than replacing them. Published jointly with Brookings/Hamilton Project.
Uses Gallup Workforce Panel data to examine partisan differences in workplace AI adoption. While Democrats report higher frequent AI use (30.1% vs. 25% for Republicans in Q1 2026), this gap shrinks to statistical insignificance after controlling for education and reverses sign with occupation and industry fixed effects. Suggests the partisan AI divide is driven by educational and occupational sorting, not ideology.
A 51-page paper from the Federal Reserve Board benchmarking Claude Code as an empirical economist. The subtitle 'Like Humans but Without the Tails' suggests the AI performs comparably to human economists on average but with less variance in output quality.
Investigates how various LLMs behave in the Ultimatum Game, varying stake sizes and opponent type (human vs. AI). Finds that while some models approximate the rational benchmark, a distinct 'altruistic' mode emerges where LLMs propose hyper-fair distributions (>50%). Highlights the need for careful testing before deploying AI agents in economic settings.
Argues that compute — not human capability — will be the scarce resource in an AGI economy. Most jobs won't be automated because replacing them wouldn't be worth the computing cost, not because they require uniquely human skills. Forthcoming in 'The Economics of Transformative AI' (Agrawal, Brynjolfsson & Korinek, eds.).
Elicits forecasts from five groups — academic economists, AI company employees, policy researchers, accurate forecasters, and the general public — about AI's economic impact. Finds expectations of substantial AI capability advances by 2030, modest labor force participation declines, and 2.5% annual GDP growth. Under a rapid AI advancement scenario, forecasters project ~4% GDP growth and labor force participation falling to 55% by 2050. Expert disagreement stems primarily from differing views on highly capable AI's economic effects rather than on the pace of AI progress.
Large-scale experiments show that after just ~10 minutes of AI-assisted problem-solving, participants gave up more frequently and performed worse once the AI was removed, compared to those who never used it. The persistence costs were concentrated among users who prompted AI to solve tasks directly rather than seeking hints. Effects replicated across arithmetic and reading comprehension, suggesting a general consequence of AI-assisted problem-solving rather than a domain-specific one.
Examines how AI systems that synthesize population beliefs as training data influence social learning. Using an extended DeGroot model with an AI aggregator, introduces a 'learning gap' metric. Key finding: rapid updating degrades learning, while slower updates and localized aggregators trained on specialized information consistently improve outcomes. Consolidating local systems into a single global aggregator diminishes performance.
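The underlying DeGroot rule is a repeated weighted average of current beliefs. A minimal sketch with a hypothetical three-agent network (the paper's extension adds an AI aggregator node trained on pooled beliefs, omitted here):

```python
def degroot_step(beliefs, weights):
    """One round of DeGroot updating: each agent's new belief is a
    weighted average of everyone's current beliefs (rows sum to 1)."""
    n = len(beliefs)
    return [sum(weights[i][j] * beliefs[j] for j in range(n)) for i in range(n)]

beliefs = [0.0, 0.5, 1.0]          # initial beliefs of three agents
W = [[0.6, 0.2, 0.2],              # row-stochastic "listening" weights
     [0.2, 0.6, 0.2],
     [0.2, 0.2, 0.6]]
for _ in range(50):
    beliefs = degroot_step(beliefs, W)
# Under this symmetric W, all agents converge to the consensus 0.5.
```

The paper's 'learning gap' then asks how far such a consensus lands from the truth once an aggregator that retrains on pooled beliefs is inserted into W.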
Documents international trade patterns in AI-related goods using an LLM classification tool. AI-related products account for 23% of U.S. imports in 2025, with 73% growth since 2023 vs. 3% for non-AI products. Mexico and Taiwan dominate, accounting for roughly half of all U.S. AI-related trade. The U.S. goods trade deficit would have been nearly $200 billion smaller in 2025 without the AI expansion.
NY Fed research using the November 2025 Survey of Consumer Expectations. College graduates are more than twice as likely to use AI at work (58.7% vs. 22.9%). Only 15.9% of employers provide AI training despite 38% of workers viewing it as important. Workers without AI training access would accept an 11.4% salary cut to gain it; those with access require a 24.2% raise to give it up.
Federal Reserve FEDS Notes article surveying the state of AI adoption across the U.S. economy. Census data show ~18% of firms adopted AI by year-end 2025, while the Atlanta Fed's Survey of Business Uncertainty estimates 78% of the labor force works at firms that have adopted AI. Adoption is highest in professional services and finance. Newer surveys show a stronger link between adoption rates and firm size.
St. Louis Fed analysis finding that U.S. firms have a higher share of workers using AI than European firms, and that management practices are a surprisingly powerful predictor of cross-country AI adoption. A one-standard-deviation increase in the management index is associated with a 9.6 percentage point increase in AI adoption. The authors argue that narrowing the U.S.-Europe AI adoption gap may require first narrowing the management gap.
IMF staff note synthesizing insights from a high-level workshop and scenario-planning exercise co-organized with EconTAI. Argues AI should be treated as a macro-critical transition rather than a standard technology shock. Macroeconomic outcomes will depend less on frontier capability alone than on the speed and breadth of AI diffusion and institutional readiness. Covers implications for growth, labor markets, equality, financial stability, and governance.
Studies generative AI's impact on U.S. households' time allocation using browsing data from 200,000+ home devices (2021–2024). Finds that ChatGPT adoption substantially increases leisure browsing while leaving productive task time unchanged — households primarily use AI for productive non-market tasks (job hunting, travel planning, shopping), freeing up leisure time. Implies large home-productivity gains from generative AI, but raises digital-divide concerns as younger, higher-income users adopt faster.
Field experiment across 66 firms and 7,137 knowledge workers randomly given access to a generative AI assistant integrated into the email, meeting, and writing applications they already used. In the second half of the six-month experiment, the 80% of treated workers who actively used the tool spent two fewer hours per week on email and reduced their time working outside regular hours. Beyond these individual time savings the authors detect no shifts in the quantity or composition of workers' tasks, suggesting that broader reallocation of responsibilities requires institutional and team-level changes rather than just individual AI access.
Courses & Learning 15
A free online course (MA/PhD level) taught by Anton Korinek. Covers: the nature of intelligence and information (Week 1); modeling technological progress with AI (Week 2); AI's impact on economic growth, including scenarios like super-exponential "singularity" growth (Week 3); implications for labor markets and inequality (Week 4); and policy responses in the Age of AI (Week 5).
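The Week 3 growth scenarios can be illustrated with a toy knowledge-accumulation rule A' = A + s·A^φ, where φ > 1 produces accelerating, 'singularity'-style growth. The parameters below are hypothetical, chosen only to show the contrast, and the rule is a simplification of the models covered in the course:

```python
def simulate_knowledge(a0, s, phi, periods):
    """Toy knowledge accumulation A' = A + s * A**phi.
    phi = 1 gives constant exponential growth; phi > 1 makes the growth
    rate itself rise over time (an accelerating, 'singularity'-style path)."""
    path = [a0]
    for _ in range(periods):
        path.append(path[-1] + s * path[-1] ** phi)
    return path

exponential  = simulate_knowledge(1.0, 0.1, 1.0, 20)   # steady 10% growth
accelerating = simulate_knowledge(1.0, 0.1, 1.2, 20)   # growth rate rises
```

Comparing period-over-period growth rates along the two paths makes the qualitative difference between the scenarios immediate.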
An open course by Gabor Bekes on incorporating AI into the data analysis workflow. Assumes basic econometrics knowledge and teaches how to use LLMs for coding and research. Weekly modules include: coding with AI (Week 0), LLM Review (Week 1), text-as-data (Weeks 5-6), and AI for research including regression controls, IV, diff-in-diff (Weeks 9-11). Open-source with assignments and an AI glossary.
An NBER workshop focusing on transformative AI. Researchers presented work on long-term AI impacts - from AI-driven growth models to AI's effect on labor and innovation policy. (Organized by Anton Korinek.)
Details Kevin Bryan's innovative use of AI in economics education. In 2023 he developed AI-based "virtual TAs" that answer student questions and generate adaptive quiz questions, greatly improving engagement. Also details his project with Joshua Gans (All Day TA) to personalize learning via AI tutors.
A credential consisting of several Stanford courses covering core AI concepts. Aimed at professionals who want structured, high-quality education in AI without committing to a full degree.
Introductory slide deck/webinar explaining how generative models work, plus basics of prompting, fine-tuning, and ethical concerns. Valuable for academics new to AI.
By Afshine & Shervine Amidi (Adjunct Lecturers at Stanford). Full course (20+ lectures) covering NLP's deep learning evolution: word embeddings, attention, the Transformer architecture, pre-training vs. fine-tuning, RLHF as used in ChatGPT, and strategies for deploying LLMs efficiently. Balances theory with practical insights. Course website: cme295.stanford.edu/syllabus
Free, hands-on course on building autonomous AI agents, from basics through multi-step reasoning agents that combine LLM reasoning with real-world tool use.
3-hour crash course covering how autonomous AI agents work: the Perception-Decision-Action loop, examples like AlphaGo and AutoGPT, and building a simple ReAct-style agent. Slides, colab notebook, and YouTube recording available.
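The Perception-Decision-Action loop of a ReAct-style agent can be sketched with a stubbed model. Everything here — the `calc` tool, the scripted model, the transcript format — is an illustrative placeholder, not the course's code:

```python
def react_loop(model, tools, question, max_steps=5):
    """Minimal ReAct-style loop: the model emits Thought/Action/Answer
    lines; tool observations are appended to the transcript until it
    produces an answer or runs out of steps."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += step + "\n"
        if step.startswith("Answer:"):
            return step.removeprefix("Answer:").strip()
        if step.startswith("Action:"):
            name, _, arg = step.removeprefix("Action:").strip().partition(":")
            observation = tools[name.strip()](arg.strip())
            transcript += f"Observation: {observation}\n"
    return None  # gave up after max_steps

# Scripted stand-in for a real LLM call (it ignores the transcript):
script = iter(["Thought: I should just compute it.",
               "Action: calc: 2 * 21",
               "Answer: 42"])
tools = {"calc": lambda expr: eval(expr)}  # eval is for the demo only
answer = react_loop(lambda transcript: next(script), tools, "What is 2 * 21?")
# answer == "42"
```

Swapping the scripted stub for a real LLM call (and real tools) is the whole step from this sketch to an actual agent.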
Comprehensive, frequently updated resource hub with papers, demos, tutorials, interview prep, and open-source implementations for generative AI. Structured as an "Awesome List" with extra guidance.
Clear, visual explanations of statistics and ML concepts. "Statistics, Machine Learning, Data Science, and AI seem like very scary topics, but since each technique is really just a combination of small and simple steps, they are actually quite simple." Great for building intuition on regression, p-values, neural nets, gradient boosting, etc.
Overview of OpenAI's o1 release with advanced features for building AI agents: extended context windows, improved function calling, and Vision+Voice unified in agents.
Explains tokenization, next-token probabilities, and sampling to demystify how LLMs generate text. Covers greedy vs. temperature sampling and why LLMs sometimes repeat or err.
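Greedy versus temperature sampling can be demonstrated in a few lines over a toy next-token distribution (hypothetical logits; real LLMs sample over vocabularies of tens of thousands of tokens):

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Sample a next token from raw logits. temperature=0 means greedy
    decoding (argmax); higher temperatures flatten the softmax
    distribution, making lower-probability tokens more likely."""
    if temperature == 0:
        return max(logits, key=logits.get)
    m = max(l / temperature for l in logits.values())   # shift for stability
    weights = {t: math.exp(l / temperature - m) for t, l in logits.items()}
    z = sum(weights.values())
    r, cumulative = random.random(), 0.0
    for token, w in weights.items():
        cumulative += w / z
        if r <= cumulative:
            return token
    return token  # guard against floating-point rounding

logits = {"the": 3.0, "a": 2.0, "banana": -1.0}
greedy  = sample_next_token(logits, temperature=0)    # always "the"
sampled = sample_next_token(logits, temperature=1.5)  # any of the three
```

The same mechanism also explains repetition: greedy decoding can loop on a high-probability continuation, while a nonzero temperature occasionally escapes it.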
Anthropic's official educational repository with five hands-on courses: API Fundamentals, Prompt Engineering Interactive Tutorial, Real World Prompting, Prompt Evaluations, and Tool Use. All courses are Jupyter notebooks with practical exercises.
Two-session bootcamp at the University of Vermont introducing economists to agentic AI tools for research and teaching. Session 1 (Apr 22) covers the difference between chat-based and agentic tools, context windows, standing instructions, voice files, and reusable skills, with live demos of website redesign and custom skill creation. Session 2 (Apr 27) focuses on research pipelines and automation — code and paper auditing, lecture-building, and web scraping workflows. Materials include downloadable PDF slide decks, a setup/prep guide, an Applications & Skills page with custom Claude Code skills (e.g., /code-review, /econ-audit), and a curated outside-resources list.
Coding with AI 20
VS Code extension for Stata: syntax highlighting, snippets, and running Stata code from the editor. Lets economists enjoy VS Code's features (multi-cursor editing, Git integration) and AI coding assistants while working on data analysis in Stata.
A practical introduction for economists who want to use AI agents for literature review, coding, data work, replication, writing, and slides without needing an enterprise-sized budget. Part of the AI MBA platform. See also the related VoxDev talk.
Ongoing series of 34+ walkthroughs and explanations of using Claude Code for quantitative and empirical social science projects. By the author of Causal Inference: The Mixtape. Topics include: workflow optimization with Deming's zero-error philosophy, comparative audits of causal inference packages (Callaway & Sant'Anna), using Claude Code for cannabis research replication, mobile "Dispatch" workflows, and philosophical reflections on AI-assisted discovery. A rich, practitioner-oriented resource for economists adopting Claude Code.
Guide on how to build an agentic workspace using Cursor and Claude Code, aimed at non-technical people.
Guide on using Claude Code 2.0 and getting better at using coding agents.
Pedro Sant'Anna's personal Claude Code workflow and setup guide.
Guide on using Claude Code with Stata for economics research.
Guide on using Git with Claude Code for economists.
Comprehensive resource guide covering the "hidden curriculum" of academia. Includes dedicated sections on Using AI, LLMs, Claude Code and Cursor (with links to Golub's AI tools overview, Claude Code/Cursor workflows, NotebookLM, AI agents for research) and Claude Code Skills (presentations, posters, feedback systems). Also covers writing, publishing/refereeing, workflow/tables/graphs, presentations, productivity, coding, and stress management. A one-stop hub for PhD students navigating modern academic research.
Step-by-step case study on building an app with Claude Code as a pair programmer, showing an iterative prompt-test-debug loop. Inspiring for people who have ideas but limited coding skills.
Thread on using agents (e.g., GPT-Engineer/Manus) to build projects without being a traditional programmer. Main point: being non-technical is no longer a barrier. Focus on problem descriptions and let AI handle implementation.
Anthropic's official guide with example prompts for Claude. Illustrates techniques like setting role/tone, providing sufficient context, and using few-shot prompting. Each example includes an explanation of why it works well.
Community-created doc on how to optimally provide context to an LLM. Covers strategies for system vs. user message, how to front-load important details, methods to chunk information, and tricks like using delimiters to anchor model attention.
Summary of Andrew Ng's project using an AI agent as a reviewer for research papers. The agent follows a reviewing protocol - reading, checking completeness, critiquing each section. Results matched expert reviewers ~60-70% of the time on decisions.
A gentle introduction to using Claude Code for academics. Includes presentation slides, an "Editor" persona for academic writing, and a curated collection of Claude Code tools and workflows for research.
Political economist Chris Blattman (UChicago Harris) documents AI tools and workflows for knowledge workers — from chatbot prompting to advanced Claude Code automation. Includes downloadable skills, templates, and case studies for managing scheduling, email, and research without coding.
Markus Academy mini-series on using Claude Code for applied economics research (5 of 7 episodes released). Part 1: Getting Started. See also Part 2: Data Analysis, Part 3: Web Scraping, Part 4: Large Datasets and Structured Databases, and Part 5: Writing & Thinking.
Two-part series on using AI to review and improve academic papers. Part 1 covers using Cursor and Claude Code for writing, editing, and getting referee-style feedback. Part 2 introduces free Claude Code skills for structured paper review, pre-analysis plan review, grant review, and code–paper correspondence checking. All tools available on GitHub.
Official OpenAI plugin that brings Codex into Claude Code workflows. Three commands: /codex:review for a read-only code review, /codex:adversarial-review for a steerable challenge review, and /codex:rescue to hand work off to Codex for a second pass from a different agent. Open source.
Practical guide to using Claude Code inside VS Code for academic economic research. Covers installation and setup, recommended extensions for Stata, Python, R, and LaTeX, file format handling, project customization via CLAUDE.md and reusable skill commands, Git integration, and context window management. Tailored to empirical economics workflows such as robustness checks, panel regressions, and research feedback automation.
AI Tools for Research 12
An experimental system that generates a Wikipedia-like report on any topic with the help of AI. Input a topic and STORM will perform web searches, gather information, and interactively help curate it into a structured article with citations. Uses retrieval-augmented generation. Open-source on GitHub.
An AI "referee" for academic writing. Upload a draft and Refine returns a report highlighting issues with correctness, clarity, and consistency - e.g. pointing out if a result doesn't follow from the methodology, or if a term is used inconsistently. Built by Yann Calvó López and Benjamin Golub. "Refine devotes hours of compute to help you find and fix the issues that matter most to readers and reviewers."
An experimental agent that acts as a conference reviewer. Input a paper or abstract and it produces a referee report: summarizing contributions, listing strengths and weaknesses, and giving a recommendation. Uses chain-of-thought prompting to emulate how a human reviewer would summarize and then critique.
"Talk to Scholar" feature allowing natural language questions against scholarly literature. Ask a research question and it will synthesize findings from papers with citations. Uses LLMs fine-tuned on academic text combined with Google's vast index of publications.
A free tool by Ought that uses language models to help with literature review and Q&A. Ask a research question and it will search academic papers, summarize key findings, extract relevant data or coefficients. Also features paper similarity search and PDF summarization.
Brings AI capabilities to Stata through the Model Context Protocol (MCP), enabling Claude and other AI assistants to execute Stata commands, run .do files, and interpret economic data directly from code editors like VS Code and Cursor. Supports paper replication, hypothesis testing, and econometrics learning workflows.
Open-source AI referee reports for academic papers. No account needed — you pay the API cost directly (typically under $2 per review, 20+ detailed comments). Blind-evaluated against refine.ink, Stanford Agentic Reviewer, and reviewer3.com; scores higher on coverage, specificity, and depth. MIT licensed. See also GitHub.
Open-source CLI tool for AI-powered research workflows. Supports deep research briefs with citations, literature reviews, paper auditing against codebases, and experiment replication. Uses multiple research agents (Researcher, Reviewer, Writer, Verifier) with AlphaXiv integration for paper search.
Develops an AI-assisted workflow (using Claude Code and ChatGPT) that automates full-paper replication at scale — retrieving materials, reconstructing environments, executing code, and matching outputs to reported estimates. Applied to 384 studies (3,382 models) from top political science journals, finding reproducibility rates rose from 29.6% to 79.8% after data archiving mandates.
Open-source, not-for-profit AI paper reviewer. Plug in a paper, your OpenRouter API key, and email; reviews cost under $2 using SOTA models (Claude Opus, GPT-5.4) or under $1 with open-source models. Returns an interactive panel with major and minor comments traceable to the text and tickable when addressed. Benchmarks at coarse.ink/compare (judged by Gemini-3.1-pro) rival Reviewer3, Refine.ink, and Stanford Agentic Reviewer. Reviews active for 90 days, no data retention, explicit opt-out of training on all model calls.
A Claude Code plugin for academic research workflows. Bundles eight skills covering MCP-grounded citations, empirical integrity checks, systematic literature reviews, Zotero integration (via Better BibTeX), and parallel-critic manuscript revision. Installs from inside Claude Code via /plugin marketplace add mronkko/claude-academic-research; an interactive /setup wizard configures Zotero and metadata-source credentials. Cross-platform (Windows, macOS, Linux).
Companion paper for ZeroPaper (https://github.com/alejandroll10/zeropaper), an end-to-end autonomous research pipeline that takes a domain as input and produces a paper of roughly JFQA quality with no human in the loop between launch and finish. The system is built on three premises — state and control flow live outside the model, every stage is verified rather than trusted, and termination is mechanical — and coordinates ~30 specialized agents across 10 stages and 6 adversarial gates (math audit, novelty check, mechanism review, simulated refereeing) under any of three host runtimes (Claude Code, Codex, or Gemini CLI). The paper sets out the design discipline that makes a pipeline this long terminate without drift: ten premises about LLM behavior and six derived principles. Cost is roughly $2/paper amortized under a flat-fee max subscription (~100 papers/month), versus ~1000× more on pay-per-token APIs. Released under a custom share-alike research-use license — free for non-commercial research and education, but submitting any pipeline-produced work to a journal, preprint server, conference, or thesis committee requires prior written notice, and outputs carry a non-cosmetic provenance watermark.
Talks & Videos
A high-profile AEA session featuring economists discussing how AI is affecting labor markets. Topics included new evidence on AI-induced job polarization, the impact of generative AI on programmer productivity, and firms' adoption of AI. Early findings suggest AI can increase productivity within certain high-skill jobs but might enable broader automation of routine cognitive tasks. Saturday, Jan 3, 2026, 10:15 AM - 12:15 PM (EST).
AEA 2026 panel focusing on LLM applications in economic research. Presentations covered using LLMs to parse legal and regulatory text, LLM-based agents for conducting simulated experiments (as in Homo Silicus), and improvements in multilingual models for development economics. Monday, Jan 5, 2026, 10:15 AM - 12:15 PM (EST).
Benjamin Golub overviews AI tools that can accelerate research. Part 1: introduces Cursor, an AI-enhanced code editor. Part 2: discusses agents and custom tools like Refine.ink for draft review. Emphasizes prompting techniques and "low-hanging" uses of AI. (Markus Academy Episode 154)
Hands-on Cursor IDE demo showing AI-assisted coding for economic research. Live-codes an example showing how to ask the LLM to generate boilerplate, explain errors, and explore model variations.
Refine.ink demo showing AI-generated referee-style feedback on a draft (clarity, consistency, missing citations), with emphasis on human judgment. Showcases how AI can assist in the evaluation stage of research.
Markus Academy lecture where Kevin Bryan demonstrates practical ways GPT-4 can augment economic research: debugging Stata code, summarizing literature, checking proofs, with cautions about verification.
Annual meeting at Cornell (June 16-17) fostering interaction between computer science and economics, with emphasis on AI/ML. Keynotes by David Blei, Annie Liang, Sendhil Mullainathan, Aaron Roth, and Stefan Wager. Co-chaired by Francesca Molinari and Éva Tardos.
Three-day NBER Summer Institute session (July 22-24) on digital economics and AI, organized by Erik Brynjolfsson, Avi Goldfarb, and Catherine Tucker. Held in Cambridge, MA and streamed on YouTube. One of the premier venues for presenting frontier AI economics research.
Tyler Cowen makes the case for integrating AI into higher education and argues college classes should devote significant time to learning how to use AI. Discusses the future of writing and thinking in academia, Cowen's solution to cheating concerns, and whether there's value to education designed to help students become who they want to be rather than ensure mastery of a subject. Cowen also shares how he personally has adapted to AI in his own workflow.
Commentary & Analysis
A Brookings panel moderated by Anton Korinek with David Autor, plus ChatGPT and Claude as special "guests." They discuss cognitive automation, LLMs augmenting worker productivity, the need for worker retraining and policies.
Curated collection of AI resources for economists: tools, papers, tutorials, and guides for using AI in economic research.
Paul Goldsmith-Pinkham (Assistant Professor of Finance, Yale SOM) argues AI tools compress research timelines while preserving what makes research genuinely valuable. Outlines eight research stages (ideation, design, data assembly, analysis, robustness testing, writing, submission, publication) and argues LLMs accelerate transitions between stages but don't eliminate the need for careful iteration. Addresses two central anxieties: (1) "slop" - more papers published faster with less rigor, p-hacking supercharged by automation; and (2) career anxiety - execution skills (coding, writing) now have lower market value. Claude Code features prominently. Key insight: "The hard part is still knowing where to walk." Skills that distinguish good researchers - taste, institutional knowledge, iterative thinking - become more valuable as execution barriers lower.
Thread with practical tips for economists using AI in daily workflows. Announces a project to make AI coding assistants more useful for day-to-day work, demonstrating how to prompt AI to transform pseudocode into working Stata code and suggest robustness checks. Also teases Refine, his AI referee tool.
Response to Joshua Gans's "Reflections on Vibe Researching" post. Reflects on more advanced uses of AI in research: using GPT-4 to verify mathematical derivations and generate synthetic data for testing empirical strategies. Notes current limitations but also the promise: "Yes, AI can produce new maths results, although just incremental progress seems possible for now."
Ben Golub emphasizes "low-risk uses of AI" that are already yielding returns in research - e.g., using GPT-4 to summarize papers or suggest alternative phrasings. Agrees fully AI-generated research is not yet reliable, but tools like Refine help ensure quality. Balanced view: AI won't replace researchers, but researchers who effectively use AI will outpace those who don't.
Joshua Gans conducted a year-long experiment with "AI-first" research ("vibe researching"), using AI at every stage of writing papers in 2025. His conclusion: the experiment ultimately failed. While AI-generated mathematics checked out, theoretical oversights in game theory and equilibrium analysis went undetected until peer review. Reduced friction meant he pursued more mediocre ideas to completion. LLMs present results with false confidence, tempting researchers into believing they've discovered something genuine. Tools used: ChatGPT (o1-pro and 5.2 Pro), Gemini, and Refine.ink. Key takeaway: AI accelerates research substantially but cannot replace human taste, peer judgment, or the value of letting ideas develop naturally. A must-read for academics experimenting with AI in their research process. See also Antonio Mele's response thread.
Tyler Cowen shared example prompts and LLM responses covering six domains identified by Anton Korinek (2023 JEL): ideation & feedback, writing assistance, background research, coding help, data analysis, and math derivations. Curated list assembled by Jesse Lastunen. See also the example prompts page.
Anton Korinek's research page featuring working papers on the economics of AI, including the Generative AI for Economic Research series, AI Governance Handbook, and the Econ TAI Initiative.
Interactive tracker summarizing AI-related laws and proposals across countries. Shows the EU's AI Act status, US sectoral AI bills, China's algorithm regulations, etc. Indicates a trend: many jurisdictions converging on rules for AI transparency, risk assessment, and accountability, though approaches differ.
UK Cabinet Office guidance on using generative AI in government services. Outlines principles: transparency (disclose AI-generated content), accountability (human oversight of AI decisions), and security (ensuring prompts/outputs don't leak sensitive data). Includes use-case examples and procurement standards.
Region-by-region roundup of AI regulatory developments. US tracker covers federal initiatives (NIST AI Risk Framework, draft bills) and state laws on AI. Updated frequently.
Interactive scenario-planning tool for exploring different AI-development trajectories and their socioeconomic impacts. Has sliders for variables like rate of AI progress, alignment success, and global coordination. Supports futures thinking for policymakers.
Eric Schmidt's stark warning: "Year 1: AI replaces most programmers; Year 2: recursive self-improvement begins; Years 3-5: AGI; Year 6: ASI." Suggests that once AI starts improving itself, it accelerates beyond human control. Adds to calls for AI governance.
Essay exploring how academia and the identity of scholars evolve with AI tools. Themes: authenticity and originality, skillset shift, ethical norms. Conclusion: being a scholar still means curiosity, rigor, and critical thinking, but tools and workflows will change.
BBVA's real-time big-data/AI dashboards (card transaction data, mobility data) for nowcasting GDP or consumer confidence. Demonstrates how AI is operationalized to monitor the pulse of the economy and geopolitics in real time.
Scott Cunningham's notes on introducing students to ChatGPT for coding in R: explaining code, generating example datasets. Candid on-the-ground look at academia's adaptation. Key insight: AI can be a great tutor but we need to teach students to use AI critically.
Kevin Bryan (@Afinetheorem) prompted OpenAI's o3 about academia's future and shared the AI's response. Partly tongue-in-cheek - using AI to advise on AI issues. His timeline provides a real-time chronicle of how a top economist is grappling with and leveraging AI.
Gary Marcus on limitations and risks of AI. Emphasizes robustness, trustworthiness, and the gap between AI output and true understanding. Valuable counterbalance to hype.
Legal scholar on AI regulation and ethics. Brings a Latin American and human-rights perspective, highlighting surveillance, data protection (GDPR), and societal impacts on the Global South.
John Horton on insights from LLM-as-agents work (Homo Silicus): what simulated agents can/cannot capture, and how computational experiments could shape future research.
A living document synthesizing all available studies and data on AI's productivity impact. Reviews the disconnect between micro studies (which overwhelmingly find positive effects, especially for low-skill workers) and macro evidence (which is now beginning to show aggregate gains). Updated regularly as new evidence arrives.
PIIE survey of the state of AI-and-labor research. Reviews high-profile studies combining AI exposure measures with employment data, noting mixed results and methodological challenges. Highlights that AI may create entirely new occupations, and productivity research shows benefits but with important caveats.
Interactive map of 1,556 U.S. universities across two dimensions: institutional resilience and post-college market positioning. Visualizes which schools are structurally positioned to weather demographic decline, fiscal stress, and AI disruption. Built with Claude.
Proposes a standard for making academic papers more accessible to LLMs — bundling papers with an llms.txt orientation file and markdown formatting. Addresses the finding that LLM-generated summaries are nearly five times more likely than human-authored ones to overgeneralize scientific conclusions.
Cowen argues the individual research paper is no longer scarce — AI can tweak, improve, or review any paper. Top economics journals are already experimenting with Refine for AI-powered reviewing. Suggests economists should focus on publishing "the box" rather than individual papers.
Accessible walkthrough of the Ludwig/Mullainathan/Rambachan econometric framework for using LLMs in empirical research. Breaks down the key insight that LLM measurement error is non-classical and correlated with covariates, explaining why a small validation sample is essential for unbiased inference even with highly accurate LLM labels.
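The framework's key insight can be sketched numerically: when LLM labeling errors are correlated with a covariate, the naive estimate is biased, but a small human-validated subsample identifies the average labeling error, which can then be subtracted. All numbers below are simulated; the error model and sample sizes are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_val = 10_000, 2_000

# True binary outcome depends on covariate x; the hypothetical LLM's
# error rate also rises with x -- non-classical measurement error.
x = rng.uniform(size=n)
y_true = (rng.uniform(size=n) < 0.3 + 0.4 * x).astype(float)
flip = rng.uniform(size=n) < 0.05 + 0.25 * x     # errors correlated with x
y_llm = np.where(flip, 1 - y_true, y_true)

# Naive estimate: treat LLM labels as ground truth (biased).
naive = y_llm.mean()

# Debiased estimate: a small validated sample measures the average
# labeling error, which is subtracted from the naive estimate.
val = rng.choice(n, size=n_val, replace=False)
bias_hat = (y_llm[val] - y_true[val]).mean()
debiased = naive - bias_hat

true_mean = y_true.mean()
print(f"true {true_mean:.3f}  naive {naive:.3f}  debiased {debiased:.3f}")
```

Even with highly accurate labels, the correlated error term does not average out, which is why the validation sample (here 20% of observations, but it can be much smaller) is essential rather than optional.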
Clark Medal winner Isaiah Andrews (MIT) wrote this note originally for his PhD advisees, then shared it widely. Lays out scenarios for how AI capabilities might evolve and what each means for economics research careers. Key advice: most students are under-investing in learning these tools; pay for the better models; and learn to audit AI output, since models produce plausible-looking results that can be wrong in ways requiring expertise to detect. Widely discussed after Tyler Cowen highlighted it on Marginal Revolution.
Panjwani compares OpenAI's Codex and Anthropic's Claude Code for economics research workflows, discussing the strengths and trade-offs of each tool for applied economists.
World Bank Development Impact blog on how AI tools like Claude Code and Stata MCP now score 77–81% on technical hiring assessments — comparable to median applicants. Rather than restricting AI use, the authors propose evaluating how candidates interact with AI tools, including reviewing chat transcripts alongside final code.
Argues that AI is a shock to relative prices inside the academic knowledge economy: as routine empirical execution gets cheaper, the value of institutional reasoning, judgment, and price theory rises. Predicts a revival of transaction cost economics and new institutional economics as AI makes it feasible to work with messier, qualitative evidence.
Argues AI won't cause mass unemployment but will trigger structural economic transformation. Drawing on structural change economics, Girard's mimetic desire theory, and consumer expenditure data, Imas shows that income effects account for over 75% of observed structural change. As commodity production gets automated, spending and employment shift toward a 'relational sector' — nursing, education, hospitality, artisanal work — where human involvement is intrinsic to value.
Substack essay arguing that advanced AI will play out very differently in low- and middle-income countries than in rich economies. High-income countries can absorb AI by augmenting their large stock of knowledge workers (~41% of employment), but LMICs employ far fewer skilled workers (under 10%), so grafting AI onto a thin layer of expertise yields limited gains. The piece suggests LMICs may benefit more from fully automated intelligence systems that handle complex tasks directly rather than copying wealthy-country adoption patterns — though this requires confronting infrastructure gaps, digital legibility, and institutional constraints. Frames the asymmetry as an opportunity for poorer countries to design novel economic structures around abundant intelligence.
Contact
Get in touch at lastunen(at)wider.unu.edu