Civitas Outlook
Topic
Economic Dynamism
Published on
Feb 6, 2025
Contributors
Rachel Lomasky

AI Disrupts Itself, Expect More of This

Summary
DeepSeek echoes a popular pattern in technology: the incumbent building bigger and more expensive tech, only to be disrupted by someone more clever.

Besides all the societal disruptions AI has caused, AI has now disrupted itself. The new DeepSeek R1 model roughly matches the performance of the previous leaders but differs from them in multiple dimensions: Chinese vs. American, open source vs. proprietary, almost state-of-the-art hardware vs. state-of-the-art hardware, and an order-of-magnitude difference in the training cost. In an industry that has used brute force to power its way forward, the nimbler competitor that trained smarter rather than harder has come out ahead. Although the DeepSeek researchers have been publishing papers about their methods for years, the model’s capabilities surprised the world and caused a fair bit of hysteria. Many investors had bet that proprietary models, trained on massive amounts of expensive, specialized hardware, would dominate the market, and the market quickly reflected their loss. For society, however, the emergence of a powerful, relatively inexpensive open source model allays concerns that companies with proprietary LLMs would dominate the market and use their power to stifle smaller competitors, limit access to generative AI models, and charge monopoly prices.

This echoes a popular pattern in technology: the incumbent building bigger and more expensive tech, only to be disrupted by someone more clever. For example, Yahoo had an enormous platform built on powerful and expensive servers, which it constantly upgraded to more powerful versions. Many saw its lead as unassailable because of the enormous cost of acquiring comparable hardware to compile the vast indices for a search engine. Then Google swooped in with both a superior algorithm and the ability to use less specialized hardware, both of which were easier to scale. Switching costs for search engines are quite low, like those for generative AI models, and Yahoo quickly lost its dominance. The displacement of large, expensive mainframes by the personal computer followed a similar course.

R1’s advantage is mainly a function of the leanness of its training process. Instead of relying on explicitly programmed rules, it uses a reinforcement learning technique to guide itself and be more autodidactic: the model is rewarded for reaching desirable end states rather than being told exactly how to get there. Rather than one large, monolithic model, DeepSeek uses a Mixture of Experts architecture, in which an ensemble of specialized models each train on a subset of the data, learning that subject area in great detail. When a user queries the model, a router sends the query to the right expert. Intuitively, using a monolithic base model is like relying on the smartest person in the room to answer every question. By contrast, the Mixture of Experts architecture is analogous to consulting a panel of specialists, directing each query to a doctor, mechanic, librarian, etc., as appropriate. This makes both training and query answering more efficient, handling more complex problems with improved accuracy and less training data. Proprietary models (e.g., OpenAI’s GPT-4, reportedly) also employ a Mixture of Experts architecture, so the architecture alone does not set R1 apart. What does is DeepSeek’s claim that it spent orders of magnitude less money training its model. DeepSeek used cheaper hardware, but the widely reported $6 million cost reflects only pre-training, excluding the costs of research, testing, salaries, and running the hardware. When these are added in, the total tops $100 million. R1 was likely cheaper to build than its competitors, but not drastically so. Still, it follows the pattern that building on existing tech is usually faster and more efficient. Knowing where previous efforts failed can help companies avoid development dead ends, leaping over those saddled with a first-mover disadvantage.
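The routing idea behind a Mixture of Experts can be illustrated with a toy sketch. This is not DeepSeek’s implementation: the three "experts" below are hypothetical stand-in functions, and the gate scores are passed in by hand, whereas a real model computes them from the input with a learned router inside the network. The sketch only shows the core mechanism: score the experts, keep the top few, and combine their outputs by weight so that the unselected experts never run.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical experts (assumption): each is just a function on a number.
EXPERTS = {
    "doubler": lambda x: 2.0 * x,
    "squarer": lambda x: x * x,
    "negater": lambda x: -x,
}

def route(gate_scores, top_k=2):
    """Pick the top_k experts by gate score and renormalize their weights."""
    names = list(EXPERTS)
    probs = softmax(gate_scores)
    ranked = sorted(zip(names, probs), key=lambda p: p[1], reverse=True)[:top_k]
    total = sum(w for _, w in ranked)
    return [(name, w / total) for name, w in ranked]

def moe_forward(x, gate_scores, top_k=2):
    """Weighted sum of the selected experts' outputs; other experts are skipped."""
    return sum(w * EXPERTS[name](x) for name, w in route(gate_scores, top_k))
```

With gate scores that strongly favor one expert and `top_k=1`, `moe_forward` simply returns that expert’s answer; with `top_k=2`, it blends the two highest-scoring experts. The efficiency gain comes from the experts that are never called.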

DeepSeek’s R1 model is certainly not the first open source model to perform relatively well, but others have not quite measured up to the proprietary models on industry-standard benchmarks. While there is nothing inherently superior about proprietary models, they are very expensive to build because of the costs of acquiring training data, buying the specialized hardware and computing power to train them, and paying the highly skilled engineers involved. Companies tend not to want to share something they have invested so much money in. However, the gap between proprietary and open source models has been closing over the past two years, and it is not surprising that an open source model is the current leader. A vast community of researchers and software developers, including many academics, enthusiastically contributes to open source, not just improving a model’s performance but also shining a light on bugs, code quality issues, and security vulnerabilities. These contributors come from diverse backgrounds and enrich the software with their particular expertise, e.g., by expanding language support. The community’s contributions to and review of the code foster trust in the model. Additionally, open source models can be customized and extended for additional domains, use cases, and model behaviors, which avoids the technical and legal barriers to improving and extending proprietary models.

For users, accessing R1 to answer queries is significantly less expensive than using its proprietary competitors, although their prices have fallen drastically in the last year. Users can also host their own copy without paying for the model, paying only for the hardware that runs inference to answer incoming queries. The reduced price puts the model within reach of smaller users, including individuals, startups, non-profits, and academics. It’s hard to foresee the full effects on society of this increased accessibility to cutting-edge models. However, in previous cases where open source software has matched proprietary technologies, it has fueled massive innovation and empowered entire ecosystems. Open source software such as Linux (Unix-like operating system), PostgreSQL (relational database management system), Python (programming language), and TensorFlow (machine learning framework) has spawned massive ecosystems because it is more adaptable and transparent and, thus, can be extended more easily. Additionally, open source technologies often lead to voluntary industry standards, increasing interoperability and accelerating technological development.
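The trade-off between paying per query and paying only for hardware can be made concrete with a rough break-even calculation. The sketch below is illustrative only: the function names are invented for this example, every figure passed to them is a hypothetical assumption rather than a real price, and it ignores electricity, staffing, and other operating costs.

```python
def break_even_tokens_per_month(hardware_cost_per_month, api_price_per_million_tokens):
    """Monthly token volume at which self-hosting costs the same as a hosted API.

    Both inputs are assumptions supplied by the caller, in the same currency.
    """
    if api_price_per_million_tokens <= 0:
        raise ValueError("API price must be positive")
    return hardware_cost_per_month / api_price_per_million_tokens * 1_000_000

def cheaper_option(tokens_per_month, hardware_cost_per_month, api_price_per_million_tokens):
    """Compare a flat monthly hardware cost against usage-based API pricing."""
    api_cost = tokens_per_month / 1_000_000 * api_price_per_million_tokens
    return "self-host" if hardware_cost_per_month < api_cost else "hosted API"
```

For instance, with a hypothetical hardware bill of 2,000 per month and a hypothetical API price of 4 per million tokens, `break_even_tokens_per_month(2000.0, 4.0)` gives 500 million tokens per month. Below that volume the hosted API is the cheaper choice, and above it self-hosting wins.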

DeepSeek is a Chinese company, raising concerns that the Chinese government will have access to user queries. While this is likely true for the particular copy of the model hosted by DeepSeek, users can host their own, perhaps even on “air gapped” hardware not connected to the internet. Even with the hosted version, users explicitly enter their queries into the tool, which avoids the tacit background data surveillance associated with phone applications. Users may not think about their privacy when they ask a question, but they know their data is being collected (as it is with American models). However, given that people already put their innermost and most private thoughts into search engine queries, this knowledge will likely not stop them. Also, there are likely to be biases (read: censorship) in R1’s answers. Proprietary American models also have these types of filters, e.g., censoring answers that would be racist or lead to inexpert medical advice. When using any model, one should be aware that biases (and hallucinations) may distort the answers.

R1 is an open source model, which means that DeepSeek’s competitors can build on it, too. In a month, or perhaps sooner, another company will likely release an open source model that duplicates this feat and probably exceeds R1’s accuracy and efficiency. The media will probably have a panic attack about that, too, although probably less of one if the model is not Chinese.

Rachel Lomasky is Chief Data Scientist at Flux, a company that helps organizations do responsible AI. She has helped scientists train and productionalize their machine learning algorithms throughout her career.
