We want a flourishing, prosperous, free society. Given the powers of AI, good institutions and governance are more important than ever. To achieve this, we should build tech and enact policies that democratize our institutions, connecting them more closely to the people they serve.
However, due to the intelligence curse, institutional solutions alone won’t be stable. The intelligence curse is especially worrying because it may bite (in forms ranging from a gradually falling labor share of income to an immediate and outright coup) during the very period when the institutional and policy work is being done, and because of what it implies about long-term stability.
Therefore, we want to diffuse decentralized technology that uplifts human economic relevance as much as possible.
However, the diffusion of powerful technology creates risks of bad actors (whether human or AI) causing havoc. Indirectly, the threat of this havoc also creates pressure to centralize and securitize, which threatens the ability to diffuse. Therefore, we need to harden the world against security threats from human misuse and from AI itself. We need to avert catastrophes.
To achieve these goals, we need to work backwards, addressing each issue at the source.
Avert
First, we need to avert security risks from AI proliferation, including rogue AIs and catastrophic misuse of AI by humans.
Doing this is good in its own right. AGI could assist bad actors in creating new security threats or causing catastrophes—or, if misaligned, be a security threat on its own. We should prevent these potential catastrophes from occurring. The case for the plausibility of a catastrophic threat from advanced AI technology has been made elsewhere at length, and we will not repeat it here.
But there are two paths to averting these risks. You could lock down the labs, centralize the technology, and prevent it from proliferating. Or, you could build technical solutions to solve AI’s potentially catastrophic risks.
We strongly endorse the latter, because the former is the most likely way to trigger the intelligence curse.
Averting Catastrophe Enables Liberty
Technology that removes the threat of catastrophe enables safe decentralization by removing the incentive to lock down, pause, or centralize—all of which require dramatic concentration of power into the hands of a small number of actors. If we fail to adequately manage AI risks, we could face a catastrophe resulting from rogue AI or an engineered pandemic, or a “warning shot”: an AI-powered event that causes a non-existential catastrophe and may be a harbinger of worse to come.
The threat of such catastrophes has inspired various centralizing proposals. PauseAI’s proposal would create a global governance regime that could unilaterally decide when AI models over 1 billion parameters (smaller than GPT-2) could be trained and when any general-purpose model could be deployed, even in the face of objections from individual countries. As they concede (without offering a remedy), “centralization of AI might make takeover risks worse” by creating “a single point of failure, which human greed and stupidity could take advantage of.”
Another example is found in Bostrom’s paper “The Vulnerable World Hypothesis”. He proposes a “High-tech Panopticon”, a double-Orwellian1 method of preventing extinction if technology enables regular people to cause mass catastrophes:
“Everybody is fitted with a ‘freedom tag’ – a sequent to the more limited wearable surveillance devices familiar today, such as the ankle tag used in several countries as a prison alternative [...]. The freedom tag is a slightly more advanced appliance, worn around the neck and bedecked with multidirectional cameras and microphones. Encrypted video and audio is continuously uploaded from the device to the cloud and machine-interpreted in real time. AI algorithms classify the activities of the wearer [...]. If suspicious activity is detected, the feed is relayed to one of several patriot monitoring stations. These are vast office complexes, staffed 24/7. There, [...t]he freedom officer then determines an appropriate action, such as contacting the tagwearer via an audiolink to ask for explanations or to request a better view. The freedom officer can also dispatch an inspector, a police rapid response unit, or a drone to investigate further. In the small fraction of cases where the wearer refuses to desist from the proscribed activity after repeated warnings, an arrest may be made or other suitable penalties imposed. Citizens are not permitted to remove the freedom tag, except while they are in environments that have been outfitted with adequate external sensors (which however includes most indoor environments and motor vehicles). [...] Both AI-enabled mechanisms and human oversight closely monitor all the actions of the freedom officers to prevent abuse.”
Other proposals, including Aschenbrenner’s proposal of locking down the labs and launching “The Project”, and similar “put all power into the national government” policies face the same problem: they create authorities that, upon achieving and diffusing AGI, would have unilateral control of global technological advancement and would simultaneously control the means of economic production.2
History is riddled with examples of calls for centralization in the hands of one actor, followed by a promise that such an actor will use their power benevolently or dissolve themselves.
For example, under Marxist-Leninist theory, after a socialist revolution a temporary “dictatorship of the proletariat” should be established, in which power (both economic and political) is centralized in the hands of the state, controlled by the proletariat via the Communist Party. This, in theory, would be used to empower the proletariat and repress the old bourgeois order. Marxists theorized that this would lead to “the withering away of the state”, eventually achieving communism: a classless, stateless society.3 In practice, however, this centralization gave Stalin the power to implement some of the most draconian policies in history under a strengthened state, which had no incentive to fade away.
We trust history and incentives, and both paint a bleak picture of how humans would fare in this world: disempowered, exploited, and at the mercy of actors they have little ability to influence.
We expect that, while these policies are politically infeasible today, they would be unlocked following some kinds of AI warning shots.4 Historically, catastrophes create the environment for government power grabs like the proposals described above. Once enacted, such policies will lead to centralization and enable authoritarianism.5
If you are a proponent of human liberty or technological progress, you should be the strongest advocate for technologies that mitigate AI’s potentially catastrophic risk. If we don’t do this and a catastrophe occurs, the most likely policy outcomes are one-way tickets to the intelligence curse.
Technology to avert catastrophe
Defensive technologies in the line of Vitalik Buterin’s d/acc proposal and Bernardi et al.’s societal adaptation framework enable a Swiss cheese approach to AI risk mitigation, where no single layer eliminates all risks but the combined layers make them extremely unlikely.

We endorse this approach, which balances safety concerns with authoritarian risks. Below, we outline specific technologies that, if implemented, could lower risks to an acceptable level in each key issue area. We focus on what we believe are the most likely catastrophic threats from AI: misuse risk (biosecurity, cybersecurity, and physical security), misalignment, and loss of control.
Biosecurity matters, most critically for preventing pandemics. AI might make it easier to engineer pandemics, though this will remain bottlenecked on physical materials and wet lab skills. Methods for stopping pandemics from starting include:
- KYC (know-your-customer) and purchase-tracking tools to bring a high level of oversight to the purchase of potentially dangerous biological materials, similar to anti-money-laundering infrastructure.
- Screening of orders from DNA synthesis providers, which is currently a voluntary standard mostly focused on known pathogens, but should expand to include AI-based estimation of pandemic potential that could catch even novel pathogens (a minimal screening sketch follows this list).
- Wastewater monitoring to detect any pathogen that is increasing quickly.
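To make the screening step concrete, here is a minimal sketch of exact-match screening against a database of sequences of concern. This is our own illustration, not an existing screening product: the k-mer length, threshold, and helper names are placeholder choices, and real pipelines must also handle things like reverse complements and orders split across providers.

```python
# Minimal illustration of sequence-of-concern screening for DNA synthesis orders.
# The k-mer length, flag threshold, and pathogen database are placeholders, not a
# real screening standard.

def kmers(seq: str, k: int = 20) -> set[str]:
    """Return the set of all length-k substrings of a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def screen_order(order_seq: str, concern_kmers: set[str],
                 k: int = 20, flag_threshold: float = 0.02) -> bool:
    """Flag an order if more than flag_threshold of its k-mers match a
    database built from sequences of concern (e.g. known pathogen genomes)."""
    order_kmers = kmers(order_seq, k)
    if not order_kmers:
        return False
    overlap = len(order_kmers & concern_kmers) / len(order_kmers)
    return overlap > flag_threshold

# Hypothetical usage: build concern_kmers offline from curated pathogen sequences,
# then screen every incoming synthesis order and escalate flagged ones to human review.
```

An AI-based extension would replace the exact-match database with a model that estimates pandemic potential from sequence features, which is what would be needed to catch novel pathogens.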
If a pandemic has already started, slowing it down will benefit greatly from:
- UV-C lighting in HVAC systems to kill pathogens that are circulating in the air, doing for air what filtration and chlorination did for the water supply in the late 1800s and early 1900s (ending typhoid, cholera, and dysentery epidemics).
- Haze (triethylene glycol) is safe to breathe and kills pathogens, potentially even more effectively than UV-C. It is a chemical precursor in a lot of supply chains, so it could be easy to mass-produce quickly in the event of a spreading pandemic if work on distribution is done ahead of time. Deploying it in high-risk sites like hospitals or ports could slow the spread of a pathogen.6
- Rapid distribution of vaccines would help, though currently regulatory approval for new vaccines requires clinical trials that are a bottleneck on speed.
As a side-effect, decisively dealing with pandemic threats might also mostly solve infectious disease.
Cybersecurity can roughly be split into “hard” cyber focused on technical vulnerabilities, and “soft” cyber focused on access management, operational security, and preventing social engineering attacks.
On the technical side, perhaps the biggest single risk from AI cyber offense is unprecedented amounts of hacking effort being spent on legacy code maintained by organizations without deep technical competence, especially when this code controls physical infrastructure (code handled by technically-competent organizations will likely be upgraded quickly). Some approaches that help with technical cybersecurity risks are:
- Formal verification of code currently requires lots of bespoke mathematical work, but AI might make this feasible at scale.
- “The Great Refactor”: using AI to rewrite many existing codebases from the ground up to be more secure and maintainable.7
- AI might bring down the cost of human-like flexibility in classic vulnerability detection methods like static analysis, fuzzing, and penetration testing (a minimal sketch follows this list).
- Hardware security will matter more, as there will be more incentive to attack chips (especially if software vulnerabilities are patched through the above, or nation-state-level actors want to damage or spy on the AI hardware of competing nations). Tamper-proof chip enclosures are one approach.
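As one concrete example of the kind of classic technique AI could scale up, here is a property-based fuzzing harness using the hypothesis library. The parse_length_prefixed function is a hypothetical stand-in for legacy parsing code, deliberately buggy for illustration.

```python
# Property-based fuzzing sketch using the hypothesis library: it generates adversarial
# byte strings and shrinks any failing input to a minimal reproducer.
from hypothesis import given, strategies as st

def parse_length_prefixed(data: bytes) -> bytes:
    """Hypothetical legacy parser: first byte declares the payload length.
    Deliberately buggy: it never checks that enough payload bytes are present."""
    length = data[0]
    return data[1:1 + length]

@given(st.binary(min_size=1))
def test_returns_declared_length(data: bytes):
    payload = parse_length_prefixed(data)
    # Property: a well-behaved parser returns exactly the declared number of bytes
    # (or rejects the input); it must never silently truncate.
    assert len(payload) == data[0]

# Running this under pytest quickly finds and shrinks a counterexample such as
# data=b"\x01", where the declared length exceeds the available payload.
```

The bottleneck today is the human effort of writing properties and harnesses like this for old, poorly understood codebases; that is exactly the kind of effort AI could make cheap.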
On the operational side, perhaps the biggest risk is automated social engineering (which has already been responsible for major cyber incidents). Solution approaches include:
- LLM scanning of incoming messages will help against spear-phishing (a minimal sketch follows this list).
- LLMs will make monitoring logs for signs of attack easier.
- AI can also help with fine-grained permission management, which is currently a major source of complexity in high-security IT, improving both productivity and security at the most security-conscious organizations (e.g. intelligence agencies, the military, and hopefully AI labs).
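A minimal sketch of what such message-scanning could look like. The prompt and the ask_llm helper are placeholders; any chat-completion API or local open-weights model could fill that role.

```python
# Sketch of LLM-based spear-phishing triage for incoming mail. The ask_llm helper is a
# placeholder for whatever chat-completion API or local open-weights model is used.
import json

def ask_llm(prompt: str) -> str:
    """Placeholder: send a prompt to an LLM and return its text response."""
    raise NotImplementedError

PROMPT = """You are an email security analyst. Assess the email below and respond with
JSON of the form {{"risk": "low" | "medium" | "high", "reasons": ["..."]}}.
Look for urgency pressure, sender/domain mismatches, credential or payment requests,
and unusual links or attachments.

Email:
{email}
"""

def triage(email_text: str) -> dict:
    """Return a structured risk assessment for one incoming message."""
    return json.loads(ask_llm(PROMPT.format(email=email_text)))

# A mail gateway might quarantine messages triaged as "high" risk and add a warning
# banner to "medium" ones, keeping a human in the loop for the final call.
```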
Physical security advances might also become important in a world of cheap and autonomous drones or robots with lethal capabilities.
AI alignment ensures that AIs pursue the goals that their creators give them, avoiding rogue AIs. While the other items in this category are about hardening the world against harm from AIs or AI-boosted humans, alignment is about making the AIs intrinsically less harmful—but both serve the same goals of reducing the chance of catastrophes and reducing the need to centralize to prevent those catastrophes. Alignment agendas have been discussed at length elsewhere8, but in brief:
- Scalable oversight is about figuring out how to give accurate feedback to powerful models, to avoid rewarding incorrect or duplicitous behavior. RLHF is an example; other work strands include weak-to-strong generalization and AI safety via debate (theory; empirical work).
- Interpretability aims to understand what neural networks are doing, in hopes that this then lets us verify and/or steer model behavior. Mechanistic interpretability aims to understand the final trained models, while developmental interpretability studies how models learn.
- Automated alignment research aims to punt the above problems to the AIs.
AI control aims to make sure that even misaligned AIs cannot cause havoc. It is in line with a standard security mindset, where you want security to hold even if you’re making minimal assumptions about a system. This should be our stance towards AIs until we have good evidence on alignment.9
Policies to support this
Our key policy ask is for government-supported moonshot projects for the risk-reducing tech we outline above, modeled after Operation Warp Speed.
Policy could also support the technical interventions outlined above in other ways. This is especially true for biosecurity threats, which are the hardest for the private sector alone to solve. In particular, governments should mandate KYC (know-your-customer) rules for DNA synthesis providers, fund wastewater monitoring for pathogens, and ban gain-of-function research—the creation of pandemic-potential pathogens in the lab for dubious information gain.
Diffuse
Second, we need to diffuse AI widely. This has two parts.
First, we want to align human capabilities with the needs of institutions, by uplifting humans. If humans can provide the things that powerful states and companies need, the interests of power will naturally lead to investment in humans. We should develop and diffuse AI-enabled technology that augments human productivity and keeps humans in the loop of economic value production.
Second, diffusion helps to decentralize10 in ways that prevent dangerous power concentration.11
If the personal computing revolution had never taken off, computers would have continued being a centralizing tool that helps large companies and bureaucracies consolidate power. But with the personal computing revolution, computing became a decentralizing force that helped uplift everyone’s capabilities, while also enabling breakthrough startups that disrupted the status quo.
In the short run, building human-augmenting technology means a wide variety of humans continue producing value for longer. This makes decentralization more likely to occur: instead of just a small cohort of AI companies and their suppliers capturing value as they gradually automate the rest of the economy, human-augmenting tech makes everyone more competitive, helping them earn capital and resources, and retain and develop pools of knowledge, data, and experience that guard against winner-take-all centralization. We should aim for a period—as long as possible—where humans and AIs specialize in complementary tasks and have symbiotic economic roles, rather than taking the shortest route to full AI substitution of all human labor.
In the longer run, AI capabilities will advance enough that economically competitive humans will become rarer and rarer. The human role will increasingly shift to delegation, ownership, and value-setting, as well as likely maintaining relationships with other humans and perhaps interfacing with the legal system. By this time, we expect there will have been decentralization of the creation, ownership, and control of the value-creating parts of the AI economy that keeps humans in the loop, even as the economy decouples from direct human labor.
Short-term: extend the human-in-the-loop period to enable decentralization
Everyone agrees that human-augmenting tech would be desirable, but many of those who have “woken up” about AI think AI progress will just be too fast.
It’s true that there is no fundamental theoretical blocker to AI being able to complete every task that humans can. It’s also true that AIs have more flexibility in their hardware and software than humans. This will mean AIs could eventually be faster, cheaper, and more capable than humans, at least in theory—but it’s uncertain how quickly this could be realized. There is reason to believe that the period of augmented humans being state-of-the-art exists and lasts years, that this period can be extended, and that extending it is valuable.
First, consider the current state. The fastest-growing AI startup is Cursor, a coding tool that puts the human firmly in the driver’s seat, and more so than in many competing, less-successful products. METR’s work shows that AIs are getting better at solving tasks with longer and longer time horizons, but on current trends they will take almost 7 years to reach a 1-month time horizon and almost 9 years to reach a 1-year time horizon at an 80% success rate. True, algorithmic breakthroughs, among other things, are very likely to speed up progress here, but also note that METR’s results are on clearly-defined software engineering tasks that don’t require deep context. We expect hard-to-judge, vague, context-rich tasks to take longer for AIs to crack: it will be hard to compile the dataset, and hard to build the RL environment.12 These moats will not last forever, but we believe that we have at least a few years.13
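To see where figures like these come from, here is a back-of-the-envelope version of the extrapolation. The starting horizon and doubling time below are illustrative assumptions, not METR’s fitted values, which is why the outputs differ somewhat from the numbers above; the point is the shape of the calculation, not the point estimates.

```python
# Back-of-the-envelope: if the task time horizon AIs can handle at a fixed reliability
# doubles every `doubling_months`, how many years until it reaches a target horizon?
# The ~15-minute starting horizon and 7-month doubling time are illustrative assumptions.
import math

def years_to_horizon(current_hours: float, target_hours: float,
                     doubling_months: float) -> float:
    doublings = math.log2(target_hours / current_hours)
    return doublings * doubling_months / 12

WORK_MONTH_HOURS = 167    # roughly one month of full-time work
WORK_YEAR_HOURS = 2000    # roughly one year of full-time work

print(years_to_horizon(0.25, WORK_MONTH_HOURS, 7))  # ~5.5 years to a 1-month horizon
print(years_to_horizon(0.25, WORK_YEAR_HOURS, 7))   # ~7.6 years to a 1-year horizon
```

Small changes to the assumed starting horizon or doubling time shift these estimates by years in either direction, which is why the trend matters more than any point estimate.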
Second, extending this window is valuable, both for governance and decentralization. The longer human economic relevance lasts, the more time there is for people to wake up to AI, and for discussion and movement-building around governance.14 Political change can take time, and the intelligence curse is likely to bite much harder and faster if society both wakes up to full automation and gets automated within a single election cycle. “Shock therapy”,15 where humans are left unemployable overnight, will likely also lead to a more extreme and chaotic political reaction.
As discussed above, extending the human-in-the-loop period for as long as possible also helps decentralize AI: rather than a few AI labs making a breakout run to seize the economy, the uplift provided by AI diffuses more widely, allowing a much greater number of actors to accumulate skills, ownership, specializations, and experience in the AI-enabled economy. This might help keep the balance of power in society much healthier, without needing to rely on government redistribution and antitrust alone. Overall, we want to extend the period during which humans are needed to meet the needs of powerful actors,16 which in turn extends the period during which states and companies have unlegislated incentives to care about humans, and gives more actors time to get a foot in the AI-enabled economy before human relevance ends.17
Third, it is possible to extend this time window through differential technological development. A focus on short AGI timelines and the inevitability of the AGI race as the overriding brute facts of our time is likely correct, but can easily obscure that there are needles we can move.
Building tech for human capabilities
We should build technology that is a complement rather than a substitute to human labor. Tools like hammers and computers make humans more effective at their work, and so are usually complements to human labor. Even if labor-complementing tools shift the landscape of jobs and tasks, they generally lead not just to more growth and abundance overall, but also tend to increase the returns to human labor and therefore human wages.18 However, the vision of AGI is human-substituting in its very definition: general intelligence that does everything a human can.
Agency and human-likeness have taken over everyone’s conception of AI. But in addition to human-like agents, there are many other types of helpful intelligences: tools, world models, information retrieval, pattern completion, advisors, and collective intelligence—implemented by systems like APIs, prediction markets, Community Notes, and so on.19 We can also decompose agency into parts: goals, situational awareness, planning, implementation, and actions are all components of an agent. These do not have to be assembled into one single artificial entity, and AI is currently progressing far from uniformly on these axes.
Instead of “unitary agents” that perform all of these functions, we should accelerate the development of AI systems that perform subsets of them, with humans or other systems filling in the gaps. There are reasons to think that agents will be the most competitive in the long run and that approaches which factorize agency will eventually be uncompetitive, but at the moment it seems that long-horizon agency is one of the things AI is worst at, and many avenues for AI development—including many of the most immediately profitable ones—are not about creating unitary agents. Humanity should resist the memetic forces pushing along the AI agent hype train, and differentially accelerate other branches of the tech tree.
Consider the CEO of a company. A CEO is an important part of a company, even if for everything the CEO does there is someone in the company better at it. Humans might remain in charge and in control, acting as a CEO or executive function to teams of AIs even once the AIs are superhuman at most tasks.
In particular, given the current pattern of AI capabilities, we expect many of the most effective products will leverage human direction, understanding of context, and ability to deal with exceptions, to drive AI systems that do most of the work. Humans could continue providing value through good taste in judgement and strategy. Top-down control of society by a few AI systems also suffers from the same problems as central planning. Hayek argued for the importance of unwritten tacit and local knowledge in managing the economy, and how this makes distributed and decentralized control necessary. As we’ve argued before, there are good reasons to think distributed control remains more effective than centralization in the AI economy, and even better reasons to push technology that helps keep the production of value decentralized, rather than enabling top-down control by the few or a singleton AI system.
Below we give some starting points for what technology to build to enable diffusion.
Pro-human user interfaces. We have not yet seen Steve Jobs-level product insight and design applied to any AI tool. Effort is increasingly spent on developing AI agents rather than AI tools. This should change.
Increasing AI-human bandwidth and decreasing latency. This lets humans incorporate AIs more solidly into their workflows and direct them faster and more carefully, making symbiotic human-AI systems more competitive.
- Augmented reality tools could help humans make decisions and take actions while receiving information at a high rate from AIs. One vision of very powerful such tools is given in this story.
- Brain-computer interfaces (BCIs).20 Instantaneous human-to-AI feedback via BCI allows humans to be effective overseers and managers of AIs, and integrate more tightly into human+AI systems21. BCIs should be noninvasive to reduce adoption barriers.
Localized AI capabilities could decentralize power from the labs and help keep more actors economically relevant.
- Easy finetuning of AI models, so that people, small businesses, and startups can create AI finetunes that embody their local knowledge as well as their personal judgment, taste, and sense of direction (a minimal fine-tuning sketch follows this list). To the extent that local and personal knowledge/taste is important, this will help non-lab actors stay relevant and competitive, by scaling their local knowledge and taste with AI productivity—see here and here for more on taste and local knowledge. Moreover, rather than aligning to a nebulous concept of overall human values or a company’s preference, such systems could be aligned to individual users themselves.
- Decentralized robotics. Data is a major bottleneck in robotics, and Moravec’s paradox suggests that robotics might lag behind other AI capabilities. Robotics might remain based on task-specific finetuning, as is currently the case even for state-of-the-art deep-learning-based robotics. This might create a world where the data and task-specific finetunes for manufacturing robots are distributed across many actors, rather than centralized into a small number of large companies, especially if we can push open-source robotics hardware and base models, and make robotics fine-tuning easy. However, one big algorithmic breakthrough in robotics data generalization could break this possibility.
- Helping humans own & control local data. If AI can cheaply do any valuable processing or deduction work when given some data, the ability to do intellectually valuable work will increasingly be bottlenecked by whether you can physically and legally give the required inputs to the AI, rather than by the information processing itself. Helping individuals and small businesses collect & manage their own data, and then protect that data from centralized AIs (through methods ranging from using open-source LLMs instead of AI lab APIs, to deliberate obfuscation and keeping data off the public internet while it’s still useful), would help the balance of power. This could be combined with data marketplaces and other systems that reduce friction in data trades while letting data owners profit from their data, as long as participating in such trades does not irreversibly give up existing data moats too cheaply in a way that centralizes power.
- Distributed training runs, such as what Prime Intellect is doing, might allow decentralized groups to train AI models.
- Local compute for running powerful models. Much of this is downstream of GPU prices, and eventually we might hope for a GPU in every home, much as computers went from unaffordable to everyone owning one (at least in the form of a phone). However, in the meantime performant GPUs are very expensive, and while some are trying, LLM inference is made cheap through maintaining high throughput by pooling requests from many users. Confidential computing technologies could let you run workloads on data centers with attestable security and privacy guarantees. Better tools for distributed infrastructure would allow a larger number of players to spin up their own compute clusters that they control, and reduce the cost barrier to controlling your compute.
- Cheap AI in general, especially open-source AI. A bad outcome is if, say, a system that can mostly substitute for some high-skill job costs ~$20,000/year—an amount that lets a company replace an employee, while making it hard for an individual human to benefit from it. While LLM inference prices are falling exceptionally quickly, there might be an intermediate period where the pricing is particularly disadvantageous to consumers and small companies while allowing incumbents to steamroll ahead. OpenAI, for example, reportedly plans to soon charge up to $20,000 per month for its most advanced AIs.22 Open-weights and open-source AI in particular help put price pressure on AI labs, preventing this state of affairs from lasting long.
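To illustrate how accessible this kind of local fine-tuning already is, here is a minimal LoRA sketch using the open-source transformers/peft stack. The model name, dataset file, and hyperparameters are illustrative choices, and a real setup would add evaluation and careful data curation.

```python
# Minimal LoRA fine-tuning sketch: adapt a small open-weights model to a locally-held
# dataset so that local knowledge and taste stay with their owner. The model name,
# dataset file, and hyperparameters are illustrative, not recommendations.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "Qwen/Qwen2.5-0.5B"                       # any small open-weights model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains a few million adapter parameters instead of the full model, which is
# what makes this feasible on a single consumer GPU.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

data = load_dataset("json", data_files="local_notes.jsonl")["train"]
data = data.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512))

Trainer(
    model=model,
    args=TrainingArguments(output_dir="my-finetune", num_train_epochs=3,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()

model.save_pretrained("my-finetune")             # a small adapter file, owned locally
```

The important property is that the resulting adapter, and the data behind it, stay under the control of the person or small business whose knowledge they encode.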
As mentioned, we don’t pretend that human augmentation can be an infinitely durable fix. However, we also reject a strand of thinking that is only willing to consider permanent solutions. In the future, we will likely know more, be wiser, and have had at least some surprises thrown at us by the course of events and the tech tree. We might also have incredibly intelligent AIs at our disposal. If we get to that future with a flourishing democratic society, that is a good first step.
Apart from its finiteness, another potential issue with technology for human augmentation is that it might further raise the returns to human talent and the talent bar to compete in the economy. One of the places where joint human-AI systems are most likely to be most helpful is at the very frontier, where AI capabilities are still patchy. We expect AI making everything easier will increase the number of people who can reach the frontier, but it will also result in outcome distributions with even fatter tails than today.23 If even a small fraction of humans are economically relevant, states and companies are still incentivized to invest in humans to cultivate outlier talents. However, greater income inequality is one force that will push for power concentration. Redistribution will become more important, as will developing a culture of noblesse oblige.
Long-term: decentralization & user-alignment keeps humans in the loop
An extended period of human-AI symbiosis and human involvement in the economy, even as AI advances, will hopefully have helped a wide set of actors gain AI-derived wealth and create and own parts of the AI economy. This will mean more decentralization and less power concentration. It will mean that more humans have owned rather than borrowed power.
Strong democratic institutions, which we discuss in the next section, will be increasingly important in this world. However, there is one technology that might be key too:
Alignment to the user. Most alignment work prioritizes aligning to some generic concept of human values (or—and this is much more likely to happen by default—a corporate statement or political compromise). It assumes that instruction-following on the part of the models is all the per-user specialization needed. However, we expect that for models to successfully act on users’ behalf across most functions of the economy and the world, they will need high-granularity, detailed alignment to each individual user. This could create an economy of agents, each of which is directly tied to one person. The agents’ activities earn that person income and rely on the user’s judgment, taste, and tacit knowledge, keeping them involved in the creation of value. The state would tax the income of the person rather than the activities of the agent.24
Policies for diffusion
Upskilling humans in the areas that will bottleneck the AI economy. AI systems are likely to have uneven capability profiles compared to humans, excelling in tasks with easy verification, short time horizons, and little interfacing with the physical world. Naturally, this will create bottlenecks which humans will be able to fill to stay relevant in the economy. There is a race between human upskilling and retraining on one hand, and AI labs smoothing over the jagged performance frontier on the other.25
- AI tutors for job changes. We expect that the changes to the economy will come at historically unprecedented speeds, and require faster upskilling than in the past.
- Finding good techniques for AI oversight and training humans in them.
- Educational experiments, like new types of schools and educational programs. The current education system, which focuses on short-horizon, easily-gradable tasks, teaches exactly what AI automates.
- Better forecasting of AI capabilities and their bottlenecks. We need better forecasts and understanding of what the economy will need and is bottlenecked on.
Policymakers should ban AI systems from owning any assets, serving as a C-suite member of a company, serving on a board of directors, or owning shares. This sounds silly now, but it’s important to enshrine the principle that humans own the top of the funnel now, before systems are good enough for companies to try to delegate these roles.
Democratize
Third, we should democratize, by making institutions more anchored to the desires of the humans they are supposed to serve. To complement the alignment of human capabilities with institutional needs that decentralization achieves, we should also develop technology that helps align institutions with humans.
This is important because the intelligence curse weakens institutional incentives to care about humans. While we’re hopeful that the diffusion steps outlined above will solve a large chunk of the intelligence curse, strengthening institutions is an important complement to that. It will also become increasingly important as AI capabilities grow and human economic relevance declines.
Moreover, AI will also help centralize power, making top-down control more plausible through the automation of effective decision-making, surveillance, and enforcement. The more powerful institutions become, the more carefully we need to design and align them.
Finally, to keep humans economically relevant, we’ll need to pass policies that move AI benefits towards regular people relative to the default. If a country’s leaders are easy to corrupt or if gridlock prevents action, that country will struggle to adapt. But if a country is a stable, effective democracy, it can override capital incentives and niche interest groups, providing voters a way to prioritize their own goals over those of their elites.
There is not one single innovation that solves all these problems. However, we will list some technologies that help build stronger, more democratic institutions.
Democratizing technologies
Representation.
- Digital advocates (proposed by Kulveit & Douglas et al) that allow policymakers to assess the values and opinions of a given population. Models aligned to individual users in detail, as discussed in the diffusion section, naturally enable digital advocates.
- Large-scale feedback collection that allows policymakers to get more fine-grained and qualitative data about citizens’ preferences than current simple numerical opinion polling does. The AI Objectives Institute’s Talk to the City project is an early example. Imagine a politician who can sit down with an AI and, on any question, get a level of understanding of voters’ preferences and conditions as if Tocqueville had spent a year travelling among the voters and then written up an analysis.
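A minimal sketch of the underlying technique: embed free-text responses and cluster them, so a policymaker can see the main strands of opinion rather than a single approval number. The library choices, embedding model name, and cluster count are illustrative; systems like Talk to the City go much further, for example by having an LLM summarize each cluster.

```python
# Sketch of large-scale qualitative feedback analysis: embed free-text responses,
# cluster them, and surface a representative comment per cluster. The embedding model
# and cluster count are illustrative choices.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def summarize_feedback(responses: list[str], n_clusters: int = 8) -> list[dict]:
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(responses, normalize_embeddings=True)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)

    clusters = []
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        centroid = embeddings[idx].mean(axis=0)
        # The response closest to the cluster centroid serves as its exemplar;
        # an LLM could instead write a short summary of each cluster.
        exemplar = responses[idx[np.argmax(embeddings[idx] @ centroid)]]
        clusters.append({"size": int(len(idx)), "exemplar": exemplar})
    return sorted(clusters, key=lambda c: -c["size"])
```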
Verification & trust.26
- Human verification is a useful primitive for many things, including restricting services (from online forums to company registration) to humans, and distributing government benefits to citizens amid the sea of impersonation and fraud that AI will make cheap. For example, anonymized biometric verification tokens (like the World Network, formerly Worldcoin) aim to prove someone’s humanity without passing on their biometrics.
- AI systems as trusted third-party auditors. It is difficult to trust a human auditor with sensitive information, and human auditors are expensive. AI auditors could have superhuman speed, cheapness, and reliability, and we might be able to get both verifiable privacy of the information they audit and verifiable integrity of the auditor itself. Imagine, for example, being able to verifiably run a specific auditing program (in the simple case, an LLM prompt) against verifiably private information. This could help with anything from governments giving assurances to citizens, to companies coordinating with each other, to the verification of international arms-control treaties.
- AI systems as trusted third-party advisers. An issue with human advisers is that their perspective is often (correctly) seen as biased or self-serving. With LLMs, we have something like a “point-of-view from nowhere”—an intelligence trained on the collected texts of humanity, without a personal agenda. “ChatGPT said so” is already sometimes used as a proxy for a fair-minded arbiter.
- AI-powered tracking of government activities. AI could democratize the ability to have intelligence-agency-level analysis and insight into a chosen actor. While this poses many privacy risks to individuals, society could use it to track government actions and uncover corruption. For example, imagine a platform on which AIs automatically collate information about which companies have lobbied for a bill, and what changes they’re likely pushing for.
Coordination.
- Contract negotiation is time-consuming, especially when the matter is complex and multiple parties are involved. AI could make negotiating a contract feasible for parties who previously would’ve found coordinating too time-consuming and expensive.
- Automated AI-based enforcement of contracts could be used—thoughtfully—to help actors commit to actions. Simple examples include bets resolving automatically based on AI judgements, or payments to a contractor triggering automatically on the satisfactory delivery of work.
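A toy sketch of the bet-resolution example mentioned above. The llm_judge and transfer helpers are placeholders for an LLM API and a payments or escrow system; a thoughtful deployment would escalate ambiguous cases to a human arbiter rather than paying out.

```python
# Toy sketch of automated bet resolution: an LLM judges a precisely worded claim against
# provided evidence, and a payment is triggered by the verdict. The llm_judge and
# transfer helpers are placeholders, not real APIs.

def llm_judge(prompt: str) -> str:
    raise NotImplementedError  # call out to an LLM of the parties' choosing

def transfer(from_party: str, to_party: str, amount: float) -> None:
    raise NotImplementedError  # call out to a payments or escrow system

def resolve_bet(claim: str, evidence: str, party_yes: str, party_no: str,
                stake: float) -> str:
    verdict = llm_judge(
        f"Claim: {claim}\nEvidence: {evidence}\n"
        "Is the claim true? Answer strictly YES, NO, or UNRESOLVED."
    ).strip().upper()
    if verdict == "YES":
        transfer(party_no, party_yes, stake)
    elif verdict == "NO":
        transfer(party_yes, party_no, stake)
    # Anything else (UNRESOLVED, malformed output) escalates to a human arbiter
    # instead of triggering a payment.
    return verdict
```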
The information environment is critical for a functional democracy, for sane decision-making anywhere in society, and for a strong, effective culture.
- Distributed fact-checking systems like X’s Community Notes at scale (a simplified sketch of the bridging idea behind Community Notes follows this list).
- “Internet gloves” where users can use AIs to pull information from platforms in selective, non-addictive ways, without being sucked into the platform.
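To show the kind of mechanism involved, here is a toy version of the bridging idea behind Community Notes: ratings are modeled as user and note intercepts plus a polarization factor, and a note surfaces only if it still looks helpful after that factor is accounted for. The single factor dimension, learning rate, and regularization below are made up for illustration; this is not X’s actual implementation.

```python
# Toy version of bridging-based ranking: fit rating ~ mu + user_bias + note_bias
# + user_factor * note_factor by stochastic gradient descent, then surface notes whose
# note_bias (helpfulness after controlling for the polarization factor) is high.
import numpy as np

def bridging_scores(ratings, n_users, n_notes, epochs=200, lr=0.05, reg=0.1):
    """ratings: list of (user_id, note_id, value) tuples, value 1.0 = helpful, 0.0 = not."""
    mu = 0.0
    user_bias, note_bias = np.zeros(n_users), np.zeros(n_notes)
    rng = np.random.default_rng(0)
    user_f, note_f = rng.normal(0, 0.1, n_users), rng.normal(0, 0.1, n_notes)
    for _ in range(epochs):
        for u, n, y in ratings:
            pred = mu + user_bias[u] + note_bias[n] + user_f[u] * note_f[n]
            err = y - pred
            mu += lr * err
            user_bias[u] += lr * (err - reg * user_bias[u])
            note_bias[n] += lr * (err - reg * note_bias[n])
            user_f[u], note_f[n] = (user_f[u] + lr * (err * note_f[n] - reg * user_f[u]),
                                    note_f[n] + lr * (err * user_f[u] - reg * note_f[n]))
    # A high note_bias means the note was rated helpful even by raters on opposite
    # ends of the learned factor axis; only such notes would be shown.
    return note_bias
```

The published algorithm uses more factors, confidence intervals, and extensive tuning, but the core idea is the same: reward agreement across the divide rather than raw popularity.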
Democratizing policies
Alongside this, policymakers should take immediate action to strengthen democracies. Weak democracies will crumble under the weight of AGI. This would include:
- Passing campaign finance reform
- Reforming anti-corruption laws
- Strengthening bureaucratic competence while reducing bloat
Governments should make courts and legislatures faster. Legislative coordination and the processing of court cases can be glacial compared to the speed of AI advances, or to the speed at which an AI-enabled executive can act. This creates a threat that the executive branch becomes effectively the sole and unchecked arbiter.
Governments should preemptively prepare for a world where lots of regular people don’t provide immediate economic value, even if that never materializes or if some people still do. If this comes to pass, they should be ready to implement a myriad of measures to distribute AI’s economic benefits to the disenfranchised. This could be a sovereign wealth fund with public ownership stakes in highly automated companies, with requirements to distribute a set percentage directly to citizens. It could also look like constitutional requirements that governments meet basic needs. Both moves could stimulate a human economy, preventing the shuttering of consumer-facing industries while simultaneously enabling people to use this wealth to launch ventures of their own.
You now have a roadmap to break the intelligence curse. What will you do with it?
1. Double-Orwellian in the sense that it is both an Orwellian policy and that it uses newspeak to cloak its draconian policies in pro-freedom language. ↩
2. If AI becomes a substitute for human labor, centralizing it in the hands of one actor is in practice centralizing the means of economic production into that actor. This is akin to central planning. ↩
3. See Lenin’s The State and Revolution. ↩
4. We hold that there is a five-part taxonomy of warning shots, of which two clearly result in centralizing safety policies being enacted by governments. In order of least to most likely to trigger dramatic policy changes (a similar point has been made by Anton Leicht here): (1) A warning shot that looks like a malfunction or glitch: a small-scale AI disruption without major loss of life that is plausibly because of an error, rather than human misuse or an intentional nefarious action by an AI system. This could be a mistake from an autonomous weapon or a cyber-attack with limited disruption. Actors may be incentivized to label many warning shots as glitches even if a model intentionally took those actions for the purpose of gaining power or causing harm. We expect no policy changes from this. (2) A limited human-enabled warning shot: a human uses a system to cause small-scale disruption without major loss of life. This could be similar to the first type of warning shot or somewhat greater. We expect minimal policy changes from this, mostly targeting liability or criminal penalties for human misuse. (3) A warning shot that originates from a foreign entity of a rival country: a rogue AI of foreign rival origin, or a foreign group of a rival country using AI, causes a large-scale catastrophe. We expect dramatic policy changes aimed at accelerating domestic AI progress in this scenario. (4) A large-scale human-enabled warning shot: a human commits a terrorist attack or other major catastrophe using AI. We expect dramatic policy changes towards centralization in this scenario—but only if it’s clear that it was AI-enabled. In the case of an engineered pandemic, its origins and the extent to which it was AI-enabled might be unclear for years afterwards, for example. (5) A large-scale autonomous warning shot: a rogue AI system commits or is caught trying to commit a catastrophe that results or would result in catastrophic loss of life. We expect dramatic policy changes towards centralization in this scenario. ↩
5. For additional evidence, see Higgs’ Crisis and Leviathan, which argues that in modern times the powers granted to government tend to ratchet up during times of crisis and not abate afterwards (summarized here), and Ole Wæver’s chapter “Securitization and Desecuritization”, which emphasizes the power of the mere speech act of framing something as a security issue in justifying extraordinary measures. ↩
6. Thanks to Andrew Snyder-Beattie for drawing our attention to triethylene glycol. ↩
7. Thanks to Herbie Bradley and Girish Sastry for this idea, expanded on in a forthcoming work of theirs. ↩
8. For example, see the overviews of the alignment agenda landscape given here or here. ↩
9. For more on AI control, see the original paper here, and follow-up work here and here. ↩
10. It is often argued that rogue AI takeover risk is minimized if all AI development is centralized in the hands of one actor, which can then proceed carefully and without race dynamics. However, it is underappreciated that rogue AI takeover is slowed if the rest of the world is more capable. If you think that the rogue AI will definitely undergo recursive self-improvement that lets it bootstrap to a very high power level, then you want to minimize the chance that a rogue AI is ever created. But if you think recursive self-improvement will not be incredibly fast, then any rogue AI trying to take over will find it harder the more AI capabilities are diffused through the rest of the world. Thus, in the most likely world, AI diffusion is good for reducing AI takeover risk. ↩
11. Of course, some types of restrictions on AI help against power concentration. In particular, restricting the AI capabilities of totalitarian states is good for power concentration risks. ↩
12. As an OpenAI researcher put it: “We do not rise to the power of our RL optimization algorithms—we fall to the hackability of our RL environment”. ↩
13. An additional ray of hope is based on recent work from Epoch, which argues that most AI value will come from general automation rather than automated AI R&D, and that AI R&D might be significantly harder than automating labor (see also this piece from Jack Wiseman & Duncan McClements that makes related points). This could incentivize AI labs to prioritize using their limited compute for widespread deployment in the areas where it’s already possible, over R&D to crack fully-general human-replacing AI. Thus, profit incentives might actually keep humans advantaged for longer. ↩
14. It’s also true that progress being too slow could result in a boiling frog effect. However, we expect AI progress to be fast enough that this is not an issue. ↩
15. For a historical example of quick economic transitions enabling power consolidation away from regular people, see how shock therapy in Russia led to the rapid rise of the oligarchs, who quickly gobbled up resources. The new oligarchs formed an interdependent relationship with Yeltsin. Later, Putin used this power to cement himself as an authoritarian leader, ultimately unshackling himself from the constraints of even the oligarchs. For more, see Rosalsky’s summaries of this here and here. ↩
16. Thank you to Liam Patell and David Duvenaud for suggesting this phrasing. ↩
17. See Huang & Manning (2025) for a thorough explanation of why pre-AGI measures are preferable to post-AGI redistributive policies. ↩
18. Note that this sentence is intentionally hedged, as there are many factors and subtleties that differ between roles and sectors. ↩
19. Eric Drexler has written extensively about these topics, and we’re also indebted to several discussions with Tom Everitt about his forthcoming work on related issues. ↩
20. See here for some speculation on what BCIs enable in the longer term. ↩
21. Note that BCIs also increase some authoritarianism risks by letting governments read (and maybe eventually control) minds. However, we expect a combination of privacy tech & practices, and good institutions and laws, will make BCIs net-good. ↩
22. An important concern here is: once powerful AI gets even cheaper, won’t incumbents with deeper pockets be able to afford more expensive super-powerful AIs that steamroll those with merely powerful AIs? This depends fundamentally on whether returns to intelligence are increasing or decreasing. It’s clear that sometimes returns to more intelligence are very high and increasing—perhaps at some critical intervals in the evolutionary path leading to humans (though likely not in the past ~30k years or so, during which the larger-brained Neanderthals went extinct and H. sapiens brain sizes seem to have slightly declined), and in many competitive domains that involve race-like or winner-take-all dynamics (such as getting to a scientific discovery first, or quantitative finance). Other times, however, there are clear limits to the ability of further intelligence to bring massively greater returns, such as when it’s critical to possess some piece of information or affect some physical process, or when a system is highly chaotic. We are not aware of a general argument in favor of either diminishing or increasing returns to intelligence being the fundamental or important condition. ↩
23. An alternative is that AI races ahead in intelligence but we get a severely robotics-bottlenecked economy, resulting in almost all white-collar jobs disappearing while blue-collar jobs remain for a few more years. This is likely to be equalizing for the wage distribution, but might be destabilizing for society at large, especially if you buy Peter Turchin’s model of social instability being driven by elite overproduction. ↩
24. For more on this concept, see our prior work. ↩
25. Note also that humans being extremely good, effective, and cheap at some tasks reduces the incentive for AI companies to get good at those tasks; ideally, we get differential human and AI specialization that results in a long period of human-AI symbiosis, rather than taking the fastest shortcut to human substitution. See J. C. R. Licklider’s 1960 essay Man-Computer Symbiosis for an old example of this vision. ↩
26. See Richard Ngo’s talk on the topic, which covers related ground. ↩