TLDR: People argue about policy facts when they really disagree about values. Empirical uncertainty is what makes this possible. AI is changing the equation: right now it makes it easier to find evidence for whatever you already believe, but better policy simulations could eventually make some factual disputes harder to hide behind. The effects won't be uniform. In democracies this would push political conflict upward, from "does the policy work?" to "who chose the objective function?" In centralized systems the same technology would more likely harden into technocratic authority. What happens to politics when people are forced to debate values directly instead of hiding behind dueling studies?
Imagine two people arguing over where to eat. One cites reviews, the other cites wait times, and after twenty minutes of back-and-forth it becomes clear that the real disagreement is simpler: one wants Thai, the other wants Italian. The empirical debate was not meaningless, but it was not the deepest thing going on.
A lot of political argument works like this. The person who wants less immigration says "immigrants depress wages," but often the deeper concern is that their community is changing faster than they're comfortable with. The person who wants more immigration says "it grows GDP," but often the deeper pull is that people shouldn't be trapped by where they were born. The homeowner fighting a new apartment building says "traffic and infrastructure," but often they don't want their neighborhood to change. The person pushing for more housing says "supply and demand," but often they're priced out of the city they live in. The minimum wage opponent says "it kills jobs." The proponent says "the studies show no effect." Both have research to cite. Neither is quite saying what they mean.
Sometimes those empirical claims are sincere. Often the motives are mixed. But in many durable disputes, the factual disagreement also serves as a veil over a deeper normative one. Not Rawls's veil of ignorance, which hides who you are in order to force fair principles. This one hides what you want, so that unfair principles can sound neutral.
Not every policy disagreement is a disguised value conflict, and not every empirical claim is cynical cover. Often people have mixed motives and genuine uncertainty. My claim is narrower: in many durable political disputes, empirical uncertainty is a political resource, not just an epistemic problem. It allows citizens, coalitions, and institutions to translate contested values and interests into the more publicly acceptable language of outcomes, evidence, and expertise. And AI is beginning to change the economics of that translation.
The veil
Hume observed that you can't derive an "ought" from an "is," and Friedman brought the same distinction into economics [1]: what will happen is a different question from what we should want. The distinction is foundational and widely taught. It's also widely ignored in political practice, because ignoring it is useful. In durable disputes, political argument tends to follow a pattern:
- The Real Want. I want Policy X because it benefits me, my group, or my ideology.
- The Stated Want. I claim to want Outcome Y, something broadly appealing. Jobs, safety, fairness, growth.
- The Bridge. I argue X is the best way to achieve Y. This is an empirical claim, and crucially, it's hard to definitively prove or disprove.
- The Shield. When challenged, I defend the Bridge, a technical argument about means. Anyone opposing X now appears to oppose Y.
- The Escape Hatch. If the empirical debate gets uncomfortable, I retreat to values: "Don't you care about Y?" If the values debate gets uncomfortable, I retreat to empirics: "The evidence clearly shows X leads to Y."
Take the immigration debate.
The restrictionist side:
- Real want: my community stays culturally familiar
- Stated want: protect domestic wages
- Bridge: immigrants depress wages
- Shield: you want American workers to lose jobs?
- Escape hatch: when wage studies come back mixed, pivot to cultural cohesion. When cohesion is challenged as xenophobic, pivot back to wage data.
The pro-immigration side:
- Real want: people shouldn't be trapped by where they were born
- Stated want: grow the economy
- Bridge: immigrants raise GDP
- Shield: you're against prosperity?
- Escape hatch: when GDP gains look uneven, pivot to humanitarian duty. When duty is dismissed as naive, pivot back to economic data.
Almost nobody running this pattern is lying. Both sides have arranged genuine concerns so the actual disagreement stays off the table. These are also charitable framings. Interest and status do a lot of the actual work on both sides, and the framework applies just as well there.
Call it a motte and bailey [2] with two mottes: when the empirical argument gets hard, retreat to values ("don't you care about workers?"); when the values argument gets hard, retreat to empirics ("the studies show..."). Most people doing this don't realize they're doing it.
The broad observation isn't new. Others have written about policy rhetoric that lets aspirations masquerade as scientific objectivity [3], and about "policy-based evidence," the tendency for willingness to act to shape how people read the evidence rather than the reverse [4]. What I want to do is explain why the veil is so durable, and what might change it.
Why it persists
The veil persists for three reasons, each operating at a different level.
First, democratic argument demands public reasons. You are not supposed to say "I want to protect my home value" or "I care more about cultural continuity than aggregate welfare." You are supposed to say "this is better for affordability" or "this protects workers." Empirical language is the grammar of democratic legitimacy, not just camouflage. Interests and values have to be translated into claims about shared goods. This means even people with entirely sincere motives are pushed toward empirical framing by the norms of public discourse itself.
Second, people are good at believing their own rationalizations. Research on identity-protective cognition shows that more scientifically literate people are better at aligning evidence with their group's position, not better at following evidence neutrally [5]. The mechanism runs deep: moral intuitions come first, reasoning follows as post-hoc justification [6], and self-deception may even be an evolved strategy. Believe your own motivated reasoning and you become more convincing [7]. So most people performing the pattern above aren't cynics. They've convinced themselves their values are the empirical consensus. The self-deception does the heavy lifting.
Third, policy effects are genuinely uncertain enough that weak causal stories can survive for a long time. Decades of research on expert political judgment show that forecasters are barely better than chance, and the most confident pundits tend to be the least accurate [8]. The veil is partly real, not purely strategic. We honestly don't know what most policies will do, which means the Bridge, the third step in the pattern above, can be built from almost anything and remain standing indefinitely.
These forces combine institutionally. Durable regulations tend to be supported by coalitions of morally motivated groups who provide legitimacy and financially motivated groups who provide muscle, with empirical complexity holding the coalition together [9]. The personal cost of holding false political beliefs is near zero, so voters can consume beliefs that feel good without paying a price for being wrong [10]. And much political behavior is better explained by signaling and status competition than by sincere policy optimization [11].
Robin Hanson's futarchy proposal, "vote on values, bet on beliefs" [12], is the closest existing design for institutionally separating normative from empirical questions. Prediction markets like Polymarket are the crude prototype: they can't tell us what to value, but they force empirical claims into scoreable form, making cheap talk about "what will happen" more expensive. Hanson proposes a reform; I want to make a broader claim about what happens when the veil thins on its own.
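To show what "scoreable form" can mean in practice, here is a minimal Python sketch. It uses the Brier score, a standard scoring rule for probability forecasts, as a simpler stand-in for a market; it is not how Polymarket or any futarchy design settles bets. The two forecasters, their probabilities, and the outcomes are all made up for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between stated probabilities and what happened.

    0.0 is perfect; always answering 50% earns 0.25; lower is better.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("need one outcome per forecast")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical data: four yes/no policy claims, 1 = the claim came true.
outcomes = [1, 0, 0, 1]

# A confident pundit who is badly wrong on two of the four claims.
confident_pundit = [0.95, 0.90, 0.05, 0.10]

# A hedged forecaster who leans the right way each time.
hedged_forecaster = [0.70, 0.40, 0.30, 0.60]

confident_score = brier_score(confident_pundit, outcomes)
hedged_score = brier_score(hedged_forecaster, outcomes)

print(f"confident pundit:  {confident_score:.3f}")
print(f"hedged forecaster: {hedged_score:.3f}")
```

Once a claim is stated as a probability with a resolution date, confident error becomes visible and costly. Nothing in the score says which outcomes anyone should want.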
AI cuts two ways
Others have noted that AI-driven policy optimization forces normative ends to be made explicit [13]. I want to go further: AI's effects on the veil are not uniform across time.
Today, AI thickens the veil. Ask any chatbot for the strongest evidence that immigration lowers wages: you get a beautifully sourced brief. Ask for the opposite: equally compelling. The person who already has a position now has a research assistant that produces PhD-level empirical cover in thirty seconds. This is motivated reasoning with infinite patience and a citation manager. Before AI thins the veil through better policy simulation, it is already making motivated reasoning effortless at individual scale and sophisticated at institutional scale.
In the medium term, AI may improve narrow policy forecasting in bounded domains. Researchers have begun using LLM agents as simulated populations: believable social agents [15], proxies for demographic subgroups [16], and economic actors whose behavior tracks real experimental results [17]. Automated social science methods now generate and test policy hypotheses in silico [18]. On the optimization side, systems like the AI Economist use two-level RL to learn tax policy while agents adapt inside a simulated economy [19], and more recent work scales these simulations to tens of thousands of persona-conditioned agents [20], [21], [22]. Beyond agent-based approaches, investment is flowing into world models that learn to predict the next state of an environment given an intervention [23], the underlying capability a policy oracle would eventually need.
An important caveat: these systems are better described as exploratory policy labs than authoritative oracles. Benchmarks show even the best models score below 41 out of 100 on behavioral realism, and demographic conditioning often makes performance worse [24]. There are fundamental questions about whether simulated subjects can replicate the causal logic of real experiments [25]. The direction is real; the destination is distant.
All of these systems share a revealing architecture: humans choose the objective function, AI handles the optimization. That embodies, in code, the Humean is-ought distinction that political argument usually resists. The optimizer handles positive economics; the objective function is the normative economics. Cleanly separated, by design.
In the longer term, if policy models become trusted enough to narrow the space of acceptable empirical disagreement, the veil becomes harder to sustain. The relevant threshold isn't when AI "works" in some abstract sense, but when institutions begin to treat its outputs as authoritative enough to narrow which empirical claims are defensible.
Imagine credible, auditable systems that can say: "If your goal is X, then Policy A outperforms Policy B by this margin, with these tradeoffs." The model says a $15 minimum wage in a given state eliminates roughly 40,000 jobs while raising incomes for 300,000 workers. Now the person who said "it kills jobs" and the person who said "no it doesn't" both lose their empirical cover. They have to say whether that tradeoff is acceptable to them. Or on immigration: the model predicts GDP grows faster, non-college wages decline modestly, cultural composition shifts at a given rate. Now every participant has to say which numbers they care about and how much.
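The minimum wage example can be written down as a small Python sketch. The forecast numbers are the illustrative ones above; the `Forecast` class, the `job_loss_weight` parameter, and the linear scoring rule are my own assumptions, not the output or interface of any real policy model. The point is only that the forecast stays fixed while the verdict moves with the weight, which is a value judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    """The empirical side: what a (hypothetical) model predicts."""
    jobs_lost: int
    workers_raised: int


def net_score(forecast: Forecast, job_loss_weight: float) -> float:
    """The normative side: one lost job counts as `job_loss_weight` raises.

    A positive score means the tradeoff is acceptable under that weight.
    """
    return forecast.workers_raised - job_loss_weight * forecast.jobs_lost


def break_even_weight(forecast: Forecast) -> float:
    """Weight at which the tradeoff is exactly neutral."""
    return forecast.workers_raised / forecast.jobs_lost


forecast = Forecast(jobs_lost=40_000, workers_raised=300_000)
threshold = break_even_weight(forecast)  # 300,000 / 40,000 = 7.5

for weight in (2.0, threshold, 12.0):
    score = net_score(forecast, weight)
    verdict = "accept" if score > 0 else "neutral" if score == 0 else "reject"
    print(f"one lost job = {weight:>4} raises -> {verdict}")
```

Under these assumptions, anyone who thinks a lost job is worse than 7.5 raised incomes rejects the policy, and anyone who thinks it is less bad accepts it. Both are reading the same forecast.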
These are conversations democracies should be having. They rarely happen because the veil provides a more comfortable arena.
This wouldn't end politics. Some disagreement is irreducibly normative [26]. Even choosing what counts as success (GDP, median income, capabilities) is value-laden [27]. AI may relocate political conflict from first-order policy claims to disputes over objectives, metrics, and model governance. A world after the veil thins would be more explicitly political than now, not post-political.
The dark version
The same technology that could thin the veil could also be used to weave a thicker one.
An organization running millions of policy simulations with slightly different assumptions can almost certainly find a configuration producing whatever result they want. Publish that one: "Our AI model shows Policy X achieves Outcome Y." This is p-hacking at a scale nobody imagined when researchers first documented how analytic flexibility produces false positives [28]. Opaque models can harden bias into apparently objective authority [29]. And Goodhart's Law applies: once actors know what metrics the optimizer tracks, they game them.
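A minimal Python sketch of how that search works. By construction the policy here has no effect at all; each "specification" is modeled as an independent noisy run, which is a simplification (real specifications share data and assumptions). The function names, sample sizes, and the 1.96 cutoff are illustrative assumptions.

```python
import random
import statistics


def simulated_effect(rng: random.Random, n: int = 50) -> float:
    """One hypothetical model run where the true policy effect is zero.

    Returns a z-like statistic for the treated-minus-control difference.
    """
    treated = [rng.gauss(0.0, 1.0) for _ in range(n)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    diff = statistics.fmean(treated) - statistics.fmean(control)
    std_err = (statistics.variance(treated) / n
               + statistics.variance(control) / n) ** 0.5
    return diff / std_err


def specification_search(n_specs: int, seed: int = 0):
    """Run many specifications and keep the most impressive one."""
    rng = random.Random(seed)
    stats = [simulated_effect(rng) for _ in range(n_specs)]
    share_significant = sum(abs(z) > 1.96 for z in stats) / n_specs
    best = max(stats, key=abs)
    return best, share_significant


best_z, share_significant = specification_search(2_000)
print(f"most extreme statistic found: {best_z:.2f}")
print(f"share of runs that look 'significant': {share_significant:.1%}")
```

Roughly one run in twenty clears the conventional bar by chance alone, so a search over thousands of runs reliably turns up a publishable "effect" of a policy that does nothing. Reporting only the best run hides the search that produced it.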
The defense would need to include open and auditable models, standardized assumptions, adversarial review of specification choices, and institutional norms treating the choice of objective function as the primary political act. Technology alone doesn't guarantee clarity. It takes institutional design.
Regime type matters
The effects aren't uniform across political systems. In pluralist democracies, stronger models would shift conflict upward, from "does this policy work?" to "who chose the objective function?" Political rhetoric would become more explicitly about fairness, liberty, dignity, and distributional tradeoffs. That's more honest. It's also more combustible, because value conflict is harder to compromise than empirical disagreement. Many existing coalitions, held together by ambiguous factual stories, might fracture once those stories weaken.
In centralized systems, the same technology would more likely strengthen technocratic governance. Strong policy AI slots naturally into the language of scientific modernization. The state gains a tool for optimization and a stronger legitimacy claim ("the model shows this is optimal") while the values remain upstream, inside party priorities, and politically uncontestable. If the objective function isn't publicly debatable and auditability is limited, AI becomes a legitimacy shield, its authority asserted rather than earned.
There's a counterargument: a single sanctioned model stack is brittle. A messy adversarial ecosystem is ugly, but it's an error-correction mechanism. Over a long enough horizon, the messy system catches its own mistakes while the efficient one compounds them.
The deepest punchline: the same capability would probably make democracies more political and autocracies more technocratic.
The questions too few are asking
The researchers building AI policy simulators are doing impressive work, separating objective functions from optimization and flagging risks in their ethics sections.
But too little of the discussion asks what happens to the structure of political argument when the empirical side becomes more tractable, or how the answer differs across regime types. The values-empirics separation is treated as a system architecture feature. The further claim remains largely undeveloped: that this separation, if credible and widespread, would restructure political discourse itself.
The irony at the center: the same technology that might eventually force more explicit political discourse is, right now, making evasive discourse more sophisticated. Whether the long run looks like the optimistic or pessimistic scenario depends on which application matures faster and which gets institutional support.
Before we ask whether policy AI is accurate, we should be asking: Who chooses the objective function? Who audits the assumptions? Who gets standing to contest the model? What happens when institutions start treating those outputs as authoritative?
Go back to the minimum wage. If the model says $15 eliminates 40,000 jobs and raises 300,000 incomes, "does it kill jobs" stops being a political question. "Is that tradeoff acceptable" starts being one. Someone has to answer it. Whose values count in the answer is the whole fight.
The empirical side of politics is getting more tractable. The normative side isn't. The conflicts of the next decade will be about metrics, assumptions, and model governance, not about whether policies "work." That's a better fight to have. It's also harder. And the veil won't be there to cover it.
References
- [1] Friedman, M. (1966). "The Methodology of Positive Economics." Essays in Positive Economics.
- [2] Shackel, N. (2005). "The Vacuity of Postmodernist Methodology." Metaphilosophy.
- [3] Wagner, R. (2014/2022). "The Peculiar Language of the Public Policy Shell Game." Rethinking Public Choice.
- [4] Marmot, M. (2004). "Evidence based policy or policy based evidence?" BMJ.
- [5] Kahan, D. et al. (2017). "Motivated Numeracy and Enlightened Self-Government." Behavioural Public Policy.
- [6] Haidt, J. (2001). "The Emotional Dog and Its Rational Tail." Psychological Review.
- [7] Trivers, R. (2011). The Folly of Fools: The Logic of Deceit and Self-Deception in Human Life.
- [8] Tetlock, P. (2005). Expert Political Judgment: How Good Is It? How Can We Know?
- [9] Yandle, B. (1983). "Bootleggers and Baptists: The Education of a Regulatory Economist." Regulation.
- [10] Caplan, B. (2007). The Myth of the Rational Voter.
- [11] Hanson, R. & Simler, K. (2018). The Elephant in the Brain.
- [12] Hanson, R. (2013). "Shall We Vote on Values, But Bet on Beliefs?" Journal of Political Philosophy.
- [13] Mökander, J. & Schroeder, R. (2024). "Artificial Intelligence, Rationalization, and the Limits of Control in the Public Sector: The Case of Tax Policy Optimization."
- [14] Bank of England (2025). "Agent-Based Modeling at Central Banks: Recent Developments and New Challenges." Staff Working Paper.
- [15] Park, J. S. et al. (2023). "Generative Agents: Interactive Simulacra of Human Behavior."
- [16] Argyle, L. et al. (2023). "Out of One, Many: Using Language Models to Simulate Human Samples." Political Analysis.
- [17] Horton, J., Filippas, A. & Manning, S. (2023/2026). "Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?" NBER Working Paper 31122.
- [18] Manning, S., Zhu, H. & Horton, J. (2024). "Automated Social Science: Language Models as Scientist and Subjects." NBER Working Paper 32381.
- [19] Zheng, S. et al. (2022). "The AI Economist: Taxation Policy Design via Two-Level Deep Reinforcement Learning." Science Advances.
- [20] Karten, S. et al. (2025). "LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra."
- [21] Kazinnik, S. & Sinclair, T. (2025). "FOMC In Silico: A Multi-Agent System for Monetary Policy Decision Modeling."
- [22] Yang, R. et al. (2025). "OASIS: Open Agent Social Interaction Simulations with One Million Agents." See also Pang, X. et al. (2025). "AgentSociety."
- [23] Bruce, J. et al. (2024/2025). "Genie / Genie 3." Google DeepMind.
- [24] Lupo, L. et al. (2025). "SimBench: A Benchmark for Human Behavior Simulation."
- [25] Gui, G. & Toubia, O. (2023/2025). "The Challenge of Using LLMs to Simulate Human Behavior."
- [26] Rawls, J. (1993). Political Liberalism.
- [27] Sen, A. (1999). Development as Freedom.
- [28] Simmons, J., Nelson, L. & Simonsohn, U. (2011). "False-Positive Psychology." Psychological Science.
- [29] O'Neil, C. (2016). Weapons of Math Destruction.