The story of DeepSeek: How Chinese AI shook Silicon Valley's hegemony

The rise of DeepSeek is undoubtedly one of the most fascinating chapters in the recent history of Artificial Intelligence. In just two years, this Chinese company has shaken the hegemony of Silicon Valley, proving that cutting-edge innovation depends less on unlimited resources than on a strategy of formidable engineering efficiency.

Here is the saga of DeepSeek, from its roots in quantitative finance to its current status as a global “disruptor.”

1. The origins: From trading to AI

Unlike ChatGPT (born in OpenAI’s research lab) or Gemini (the product of a web giant, Google), DeepSeek has its roots in high finance.

The founder: The company was launched in 2023 by Liang Wenfeng, a brilliant and discreet engineer who had co-founded the hedge fund High-Flyer Quant.
The DNA: High-Flyer is one of China’s largest quantitative hedge funds. For years, it used AI to predict market movements. Consequently, it already possessed a massive computing infrastructure (the Firefly supercomputer) and a culture of extreme mathematical optimization.

In creating DeepSeek-AI, Liang Wenfeng didn’t just want to build a chatbot; he wanted to apply the rigor of financial analysis to the design of Large Language Models (LLMs).

2. The strategy of “Less but Better”

While American giants were engaged in a race for gigantism (more GPUs, more data, more electricity), DeepSeek took the opposite path: algorithmic efficiency.

MoE Innovation (Mixture of Experts)

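One of DeepSeek’s greatest strengths was the early adoption and refinement of the Mixture of Experts architecture. Instead of activating the entire neural network for every token it processes, the model routes each token to a small fraction of its “experts.” DeepSeek-V3, for example, has 671 billion parameters in total but activates only about 37 billion of them for each token.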

The result: DeepSeek managed to maintain high-level performance while drastically reducing training and inference costs.
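To make the idea concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not DeepSeek’s actual router, which is considerably more elaborate; the function names (route_top_k, moe_forward), the eight toy “experts,” and the router scores are all assumptions made up for illustration. The point it shows is simply that only the k highest-scoring experts are ever evaluated.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the selected experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs by router weight."""
    selected = route_top_k(router_scores, k)
    output = 0.0
    for idx, weight in selected:
        output += weight * experts[idx](x)  # the other experts are never called
    return output, [idx for idx, _ in selected]

# Eight toy experts (hypothetical): expert i multiplies its input by i + 1.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for one token; a real router computes these.
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

y, used = moe_forward(2.0, experts, router_scores, k=2)
print("experts used:", used)
print("output:", round(y, 4))
```

With k=2 out of 8 experts, only a quarter of the expert computation runs for this token, which is where the savings in training and inference cost come from.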

3. The breakthrough: The DeepSeek-V Series

DeepSeek’s history is marked by the release of models at a breakneck pace, each reaching a new symbolic milestone.

DeepSeek-V1 & V2: These early models established the company’s credibility. DeepSeek-V2, in particular, shocked the open-source community with coding and mathematical performance that approached GPT-4 while being significantly “lighter” and far cheaper to run.
DeepSeek-Coder: The company quickly specialized in the field of computer programming. Its models became favorites among developers worldwide, often judged as precise as paid Silicon Valley solutions, or more so, for generating Python or C++.

4. DeepSeek-V3: The “Sputnik Moment”

In late 2024, the release of DeepSeek-V3 marked a historic turning point. For the first time, a Chinese “open-weights” model (where the weights are publicly accessible) equaled or surpassed the best closed models, such as Claude 3.5 Sonnet or GPT-4o, on many benchmarks.

Why was this historic?

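Training cost: DeepSeek revealed they trained V3 for approximately $6 million (about $5.6 million in its technical report). That figure covers only the GPU time of the final training run, not earlier research, experiments, or salaries. By comparison, it is estimated that training top-tier American models costs hundreds of millions, if not billions, of dollars.
Technological independence: Despite US restrictions on the export of high-end Nvidia chips (like the H100), DeepSeek proved that brilliant software optimization could compensate for limited access to top-tier hardware. V3 was reportedly trained on H800 GPUs, a reduced-bandwidth variant built for the Chinese market.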
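The headline training cost is easy to reproduce as simple arithmetic. The sketch below uses the GPU-hour breakdown and the assumed rental price of $2 per GPU-hour given in the DeepSeek-V3 technical report; those figures come from that report, not from this article, and the variable names are only illustrative.

```python
# GPU-hours per training stage, as reported in the DeepSeek-V3 technical report.
GPU_HOURS = {
    "pre_training": 2_664_000,
    "context_extension": 119_000,
    "post_training": 5_000,
}
# Rental price assumed by the report (USD per H800 GPU-hour).
RENTAL_USD_PER_GPU_HOUR = 2.0

total_hours = sum(GPU_HOURS.values())
total_cost = total_hours * RENTAL_USD_PER_GPU_HOUR

print(f"{total_hours:,} GPU-hours -> ${total_cost:,.0f}")
# prints: 2,788,000 GPU-hours -> $5,576,000
```

Note what the calculation leaves out: it prices only the final run at a rental rate, so it is not a measure of everything DeepSeek spent to get there.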

5. The R1 Revolution: Pure Reasoning

The next masterstroke was the launch of DeepSeek-R1 in January 2025. This model was trained with reinforcement learning to “think” before responding, an approach similar to OpenAI’s o1 model.

Chain of thought: Unlike o1, which hides its raw reasoning, DeepSeek-R1 shows its full “reasoning process” in real time.
Performance: It is capable of solving complex mathematical problems and logic puzzles with high accuracy, all while remaining free to use or extremely cheap for developers to integrate.
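How can reinforcement learning teach a model to reason? A rough sketch, in the spirit of the group-relative approach described in DeepSeek’s R1 paper: sample several answers to the same question, score each with simple rules, and reward the ones that beat the group average. Everything below is a simplified assumption for illustration; the reward values (1.0 and 0.1), the function names, and the sample completions are made up, and the step that actually updates the model is omitted.

```python
import re
import statistics

def rule_based_reward(completion, expected_answer):
    """Hypothetical reward: 1.0 for a correct final answer,
    plus 0.1 if the reasoning is wrapped in <think>...</think> tags."""
    reward = 0.0
    match = re.fullmatch(r"\s*<think>(.*?)</think>\s*(.*)", completion, flags=re.DOTALL)
    if match:
        reward += 0.1  # format bonus
        answer = match.group(2).strip()
    else:
        answer = completion.strip()
    if answer == expected_answer:
        reward += 1.0  # accuracy reward
    return reward

def group_relative_advantages(rewards):
    """Score each sample relative to its group: (reward - mean) / std."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        return [0.0 for _ in rewards]  # all samples equally good: no signal
    return [(r - mean) / std for r in rewards]

# Four hypothetical completions for the question "What is 17 * 3 + 4?"
completions = [
    "<think>17 * 3 = 51, plus 4 is 55</think> 55",
    "<think>17 * 3 = 41, plus 4 is 45</think> 45",
    "55",
    "I am not sure",
]
rewards = [rule_based_reward(c, "55") for c in completions]
advantages = group_relative_advantages(rewards)

for c, r, a in zip(completions, rewards, advantages):
    print(f"reward={r:.1f} advantage={a:+.2f} :: {c}")
```

Completions with a positive advantage would be made more likely by the training update and those with a negative advantage less likely, so correct, well-structured reasoning is gradually reinforced without anyone hand-writing the reasoning steps.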

6. Geopolitical and Economic Impact

Today, DeepSeek is no longer just an AI company; it is a symbol.

Democratization: By offering its models as open-weights, DeepSeek allows any business to run world-class AI on its own servers. The full-size models still demand substantial GPU hardware, but smaller distilled versions can run on far more modest machines.
Wall Street shockwaves: In early 2025, DeepSeek’s success caused tremors in financial markets, as investors began to question the profitability of massive capital expenditures (Capex) by American giants in the face of the Chinese model’s striking efficiency.

The incredible story of DeepSeek AI

The story of DeepSeek is that of an outsider that changed the rules of the game. By proving that high-level AI is not just a matter of raw power but, above all, of mathematical finesse, DeepSeek forced the entire industry to rethink the future of the field.

The message is clear: in the race for intelligence, it is not necessarily the one with the biggest engine who wins, but the one who knows how to best optimize the trajectory.