On Jan. 20, the Chinese AI company DeepSeek released a language model called r1, and the AI community (as measured by X, at least) has talked about little else since. The model is the first to publicly match the performance of OpenAI’s frontier “reasoning” model, o1—beating frontier labs Anthropic, Google’s DeepMind, and Meta to the punch. r1 matches, or comes close to matching, o1 on benchmarks like GPQA (graduate-level science and math questions), AIME (an advanced math competition), and Codeforces (a coding competition).
What’s more, DeepSeek released the “weights” of the model (though not the data used to train it) and released a detailed technical paper showing much of the methodology needed to produce a model of this caliber—a practice of open science that has largely ceased among American frontier labs (with the notable exception of Meta). As of Jan. 26, the DeepSeek app had risen to number one on the Apple App Store’s list of most downloaded apps, just ahead of ChatGPT and far ahead of competitor apps like Gemini and Claude.
Alongside the main r1 model, DeepSeek released smaller versions (“distillations”) that can be run locally on reasonably well-configured consumer laptops (rather than in a large data center). And even for the versions of DeepSeek that run in the cloud, using the largest model costs roughly one twenty-seventh of what OpenAI charges for its competitor, o1.
DeepSeek accomplished this feat despite U.S. export controls on the high-end computing hardware necessary to train frontier AI models (graphics processing units, or GPUs). While we do not know the training cost of r1, DeepSeek claims that the language model used as the foundation for r1, called v3, cost $5.5 million to train. It’s worth noting that this is a measurement of DeepSeek’s marginal cost and not the original cost of buying the compute, building a data center, and hiring a technical staff. Nonetheless, it remains an impressive figure.
After nearly two-and-a-half years of export controls, some observers expected that Chinese AI companies would be far behind their American counterparts. As such, the new r1 model has commentators and policymakers asking if American export controls have failed, if large-scale compute matters at all anymore, if DeepSeek is some kind of Chinese espionage or propaganda outlet, or even if America’s lead in AI has evaporated. All the uncertainty caused a broad selloff of tech stocks on Monday, Jan. 27, with AI chipmaker Nvidia’s stock falling 17%.
The answer to each of these questions is a decisive no, but that does not mean there is nothing important about r1. To consider these questions properly, though, it is necessary to cut away the hyperbole and focus on the facts.
What Are DeepSeek and r1?
DeepSeek is a quirky company, having been founded in May 2023 as a spinoff of the Chinese quantitative hedge fund High-Flyer. The fund, like many trading firms, is a sophisticated user of large-scale AI systems and computing hardware, employing such tools to execute arcane arbitrages in financial markets. These organizational competencies, it turns out, translate well to training frontier AI systems, even under the tough resource constraints any Chinese AI firm faces.
DeepSeek’s research papers and models have been well regarded within the AI community for at least the past year. The company has released detailed papers (a practice increasingly rare among American frontier AI firms) demonstrating clever methods of training models and generating synthetic data (data created by AI models, often used to bolster model performance in specific domains). The company’s consistently high-quality language models have been darlings among fans of open-source AI. Just last month, the company showed off its third-generation language model, called simply v3, and raised eyebrows with its exceptionally low training budget of only $5.5 million (compared to training costs of tens or hundreds of millions for American frontier models).
But the model that truly garnered global attention was r1, one of the so-called reasoners. When OpenAI showed off its o1 model in September 2024, many observers assumed OpenAI’s advanced methodology was years ahead of any foreign competitor’s. This, however, was a mistaken assumption.
The o1 model uses a reinforcement learning algorithm to teach a language model to “think” for longer periods of time. While OpenAI did not document its methodology in any technical detail, all signs point to the breakthrough having been relatively simple. The basic formula appears to be this: Take a base model like GPT-4o or Claude 3.5; place it into a reinforcement learning environment where it is rewarded for correct answers to complex coding, scientific, or mathematical problems; and have the model generate text-based responses (called “chains of thought” in the AI field). If you give the model enough time (“test-time compute” or “inference time”), not only will it be more likely to get the right answer, but it will also begin to reflect on and correct its mistakes as an emergent phenomenon.
As DeepSeek itself helpfully puts it in the r1 paper:
One of the most remarkable aspects of this self-evolution is the emergence of sophisticated behaviors as the test-time computation increases. Behaviors such as reflection—where the model revisits and reevaluates its previous steps—and the exploration of alternative approaches to problem-solving arise spontaneously. These behaviors are not explicitly programmed but instead emerge as a result of the model’s interaction with the reinforcement learning environment.
In other words, with a well-designed reinforcement learning algorithm and sufficient compute devoted to the response, language models can simply learn to think. This staggering fact about reality—that one can replace the very difficult problem of explicitly teaching a machine to think with the much more tractable problem of scaling up a machine learning model—has garnered little attention from the business and mainstream press since the release of o1 in September. If it does nothing else, r1 stands a chance of waking up the American policymaking and commentariat class to the profound story that is rapidly unfolding in AI.
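To make that recipe concrete, here is a deliberately tiny sketch of the reward loop in Python. Everything in it is a stand-in: the three canned “chains of thought,” the binary reward, and the bare-bones REINFORCE-style update are toys, whereas a real system like r1 uses a full language model and a heavier-duty reinforcement learning algorithm (DeepSeek’s paper uses one called GRPO).

```python
import random

# Toy "policy": a probability distribution over three canned chains of thought
# for the problem 13 * 7. In a real system this would be a large language
# model generating free text, not a lookup table.
chains = {
    "13*7 is about 80, call it 84. Answer: 84": 0.4,       # wrong
    "13*7 = (10+3)*7 = 70+21 = 91. Answer: 91": 0.3,       # right
    "13*7 means 13+7 = 20. Answer: 20": 0.3,               # wrong
}

def reward(chain: str) -> float:
    """Verifiable reward: 1 if the final answer is correct, else 0."""
    return 1.0 if chain.endswith("Answer: 91") else 0.0

LEARNING_RATE = 0.1
for _ in range(500):
    # Sample a chain of thought from the current policy.
    sampled = random.choices(list(chains), weights=list(chains.values()))[0]
    # Only the final answer is graded; no human labels the reasoning steps.
    r = reward(sampled)
    # REINFORCE-style nudge: move probability mass toward rewarded samples.
    chains[sampled] += LEARNING_RATE * r * (1.0 - chains[sampled])
    # Renormalize so the values remain a probability distribution.
    total = sum(chains.values())
    chains = {c: p / total for c, p in chains.items()}

# After training, the correct chain of thought dominates the policy.
print(max(chains, key=chains.get))
```

The point of the toy is the shape of the loop, not the scale: nothing in it requires human-labeled reasoning steps, only a final answer that can be checked automatically.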
What’s more, if you run these reasoners millions of times and select their best answers, you can create synthetic data that can be used to train the next-generation model. In all likelihood, you can also make the base model larger (think GPT-5, the much-rumored successor to GPT-4), apply reinforcement learning to that, and produce an even more sophisticated reasoner. Some combination of these and other tricks explains the massive leap in performance of OpenAI’s announced-but-unreleased o3, the successor to o1. This model, which should be released within the next month or so, can solve questions meant to flummox doctorate-level experts and world-class mathematicians. OpenAI researchers have set the expectation that a similarly rapid pace of progress will continue for the foreseeable future, with releases of new-generation reasoners as often as quarterly or semiannually. On the current trajectory, these models may surpass the very top of human performance in some areas of math and coding within a year.
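Here is a hypothetical sketch of that best-of-N data generation loop, under similar toy assumptions: sample_solution and verified are invented stand-ins for an expensive call to a reasoning model and for whatever trusted checker (unit tests, a symbolic solver) a real pipeline would use.

```python
import random

def sample_solution(problem: tuple[int, int]) -> tuple[str, int]:
    """Stand-in for querying a reasoning model once; returns (chain, answer)."""
    a, b = problem
    # Simulate an imperfect model that is usually, but not always, right.
    answer = a * b + random.choice([0, 0, 0, 1, -1])
    return (f"Compute {a}*{b} step by step ... Answer: {answer}", answer)

def verified(problem: tuple[int, int], answer: int) -> bool:
    """Stand-in for a trusted checker, e.g. unit tests or a symbolic solver."""
    a, b = problem
    return answer == a * b

N = 32  # candidate solutions per problem; real pipelines may sample far more
problems = [(13, 7), (21, 6), (9, 14)]

# Keep only problems where at least one sampled solution passes verification;
# the surviving (problem, solution) pairs become training data for the
# next-generation model.
synthetic_dataset = []
for p in problems:
    candidates = [sample_solution(p) for _ in range(N)]
    winners = [chain for chain, ans in candidates if verified(p, ans)]
    if winners:
        synthetic_dataset.append({"prompt": p, "completion": winners[0]})

print(f"{len(synthetic_dataset)} verified examples ready for fine-tuning")
```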
Impressive though it all may be, the reinforcement learning algorithms that get models to reason are just that: algorithms—lines of code. You do not need massive amounts of compute, particularly in the early stages of the paradigm (OpenAI researchers have compared o1 to 2019’s now-primitive GPT-2). You simply need to discover knowledge, and discovery can be neither export controlled nor monopolized. Viewed in this light, it is no surprise that the world-class team of researchers at DeepSeek found a similar algorithm to the one employed by OpenAI. Public policy can diminish Chinese computing power; it cannot weaken the minds of China’s finest researchers.
Implications of r1 for U.S. Export Controls
Counterintuitively, though, this does not mean that U.S. export controls on GPUs and semiconductor manufacturing equipment are no longer relevant. In fact, the opposite is true. First of all, DeepSeek acquired a large number of Nvidia’s A800 and H800 chips—AI computing hardware that comes close to the performance of the A100 and H100, which are the chips most commonly used by American frontier labs, including OpenAI.
The A800 and H800 variants were made by Nvidia in response to a flaw in the 2022 export controls that allowed them to be sold into the Chinese market even though they nearly matched the capabilities of the very chips the Biden administration intended to control. Thus, DeepSeek has been using chips that very closely resemble those used by OpenAI to train o1.
This flaw was corrected in the 2023 controls, but the new generation of Nvidia chips (the Blackwell series) has only just begun to ship to data centers. As these newer chips propagate, the gap between the American and Chinese AI frontiers could widen yet again. And as these new chips are deployed, the compute requirements of the inference scaling paradigm are likely to increase rapidly; that is, running the proverbial o5 will be far more compute intensive than running o1 or o3. This, too, will be an impediment for Chinese AI firms, because they will continue to struggle to get chips in the same quantities as American firms.
Even more important, though, the export controls were always unlikely to stop an individual Chinese company from making a model that reaches a specific performance benchmark. Model “distillation”—using a larger model to train a smaller model for much less money—has been common in AI for years. Say that you train two models—one small and one large—on the same dataset. You’d expect the larger model to be better. But somewhat more surprisingly, if you distill a small model from the larger model, it will learn the underlying dataset better than the small model trained on the original dataset. Fundamentally, this is because the larger model learns more sophisticated “representations” of the dataset and can transfer those representations to the smaller model more readily than a smaller model can learn them for itself. DeepSeek’s v3 frequently claims that it is a model made by OpenAI, so the chances are strong that DeepSeek did, indeed, train its model at least in part on OpenAI model outputs.
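To illustrate the mechanism, here is a minimal sketch of classic soft-label distillation, the textbook version of this idea (the logits, temperature, and token count below are invented for the example). The student is trained to match the teacher’s full probability distribution over outputs rather than a dataset’s one-hot labels, which is how the teacher’s richer representations transfer; a real pipeline would compute this loss over every token of every training sequence.

```python
import math

def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Convert raw model scores into a probability distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits: list[float],
                      student_logits: list[float],
                      temperature: float = 2.0) -> float:
    """KL divergence between softened teacher and student distributions;
    minimizing this by gradient descent is what trains the student."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(t, s) if p > 0)

# The teacher spreads probability over plausible next tokens, conveying more
# than a one-hot label would (e.g., "token 1 is also a reasonable choice").
teacher_logits = [4.0, 2.5, 0.1]  # teacher favors token 0; token 1 close behind
student_logits = [0.0, 0.0, 0.0]  # untrained student is uniform

print(f"loss before training: {distillation_loss(teacher_logits, student_logits):.3f}")
```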
Instead, it is more appropriate to think of the export controls as attempting to deny China an AI computing ecosystem. The benefit of AI to the economy and other areas of life is not in creating a specific model, but in serving that model to millions or billions of people around the world. This is where productivity gains and military prowess are derived, not in the existence of a model itself. In this way, compute is a bit like energy: Having more of it almost never hurts. As innovative and compute-heavy uses of AI proliferate, America and its allies are likely to have a key strategic advantage over their adversaries.
Export controls are not without their risks: The recent “diffusion framework” from the Biden administration is a dense and complex set of rules intended to regulate the global use of advanced compute and AI systems. Such an ambitious and far-reaching move could easily have unintended consequences—including making Chinese AI hardware more appealing to countries as diverse as Malaysia and the United Arab Emirates. Right now, China’s domestically produced AI chips are no match for Nvidia and other American offerings. But this could easily change over time. If the Trump administration maintains this framework, it will have to carefully evaluate the terms on which the U.S. offers its AI to the rest of the world.
The U.S. Strategic Gaps Exposed by DeepSeek: Open-Weight AI
While the DeepSeek news may not signal the failure of American export controls, it does highlight shortcomings in America’s AI strategy. Beyond its technical prowess, r1 is notable for being an open-weight model. That means that the weights—the numbers that define the model’s functionality—are available to anyone in the world to download, run, and modify for free. Other players in Chinese AI, such as Alibaba, have also released well-regarded models as open weight.
The only American company that releases frontier models this way is Meta, and it is met with derision in Washington just as often as it is applauded for doing so. Last year, a bill called the ENFORCE Act—which would have given the Commerce Department the authority to ban frontier open-weight models from release—nearly made it into the National Defense Authorization Act. Prominent U.S. government-funded proposals from the AI safety community would have similarly banned frontier open-weight models, or given the federal government the power to do so.
Open-weight AI models do present novel risks. They can be freely modified by anyone, including having their developer-made safeguards removed by malicious actors. Right now, even models like o1 or r1 are not capable enough to allow any truly dangerous uses, such as executing large-scale autonomous cyberattacks. But as models become more capable, this may begin to change. Until and unless those capabilities manifest themselves, though, the benefits of open-weight models outweigh their risks. They allow businesses, governments, and individuals more flexibility than closed-source models. They allow researchers around the world to investigate safety and the inner workings of AI models—a subfield of AI in which there are currently more questions than answers. In some highly regulated industries and government activities, it is practically impossible to use closed-weight models due to restrictions on how data owned by those entities can be used. Open models could be a long-term source of soft power and global technology diffusion. Right now, the United States only has one frontier AI company to answer China in open-weight models.
The Looming Threat of a State Regulatory Patchwork
Even more troubling, though, is the state of the American regulatory ecosystem. Currently, analysts expect as many as one thousand AI bills to be introduced in state legislatures in 2025 alone. Several hundred have already been introduced. While many of these bills are anodyne, some create onerous burdens for both AI developers and corporate users of AI.
Chief among these are a suite of “algorithmic discrimination” bills under debate in at least a dozen states. These bills are a bit like the EU’s AI Act, with its risk-based and paperwork-heavy approach to AI regulation. In a signing statement last year for the Colorado version of this bill, Gov. Jared Polis bemoaned the legislation’s “complex compliance regime” and expressed hope that the legislature would improve it this year before it goes into effect in 2026.
The Texas version of the bill, introduced in December 2024, even creates a centralized AI regulator with the power to create binding rules to ensure the “ethical and responsible deployment and development of AI”—essentially, anything the regulator wishes to do. This regulator would be the most powerful AI policymaking body in America—but not for long; its mere existence would almost surely trigger a race to legislate among the states to create AI regulators, each with their own set of rules. After all, for how long will California and New York tolerate Texas having more regulatory muscle in this domain than they have? America is sleepwalking into a state patchwork of vague and varying laws.
Conclusion
While DeepSeek r1 may not be the omen of American decline and failure that some commentators are suggesting, it and models like it herald a new era in AI—one of faster progress, less control, and, quite possibly, at least some chaos. While some stalwart AI skeptics remain, many observers of the field increasingly expect that exceptionally capable systems—including ones that outthink humans—will be built soon. Without a doubt, this raises profound policy questions—but these questions are not about the efficacy of the export controls.
America still has the opportunity to be the global leader in AI, but to do that, it must also lead in answering these questions about AI governance. The candid reality is that America is not on track to do so. Indeed, we appear to be on track to follow in the footsteps of the European Union—despite many people even in the EU believing that the AI Act went too far. But the states are charging ahead nonetheless; without federal action, they will set the foundation of American AI policy within a year. If state policymakers fail in this task, the hyperbole about the end of American AI dominance may start to be a bit more realistic.
– Dean Woodley Ball is a Research Fellow in the Artificial Intelligence & Progress Project at George Mason University’s Mercatus Center and author of Hyperdimensional. His work focuses on emerging technologies and the future of governance. He has written on topics including artificial intelligence, neural technology, bioengineering, technology policy, political theory, public finance, urban infrastructure, and prisoner re-entry. Published courtesy of Lawfare.