DeepSeek 'spent over $1 billion on AI, not $5.6 million'

TH (according to VnExpress) February 3, 2025 09:52

DeepSeek is estimated to have spent around $1.6 billion to develop its AI models, far more than the figure of under $6 million the company cited.

DeepSeek software interface with company logo

According to an analysis by SemiAnalysis, a well-known semiconductor and AI market research and consulting firm, DeepSeek’s hardware spending alone was “well in excess of $500 million.” In addition, generating the synthetic data used to train the model required “a significant amount of computing power.” The $5.6 million figure covers only the training run itself, not research, development, data preparation, hardware maintenance, or other related expenses.

“Our analysis suggests that the total server capital expenditure could be $1.6 billion, with a significant $944 million associated with operating these clusters,” SemiAnalysis said. “They also have to experiment, come up with new architectures, collect and clean data, pay staff, and more.”

In more detail, SemiAnalysis found that DeepSeek may have had access to around 10,000 Nvidia H800 GPUs, a throttled variant built to comply with US restrictions on AI chip exports to China, and around 10,000 H100 GPUs. The company may also have used H20 GPUs to train its models.

“These GPUs are shared between the High-Flyer investment fund and DeepSeek, distributed geographically to some extent. They are used for trading, inference, training, and research,” according to SemiAnalysis.

DeepSeek has been recruiting AI talent aggressively of late and holds regular recruitment events at top Chinese universities. The company touts “unlimited access to 10,000 GPUs” and is said to be offering some promising candidates salaries of over $1.3 million a year, much higher than those offered by major Chinese tech companies and AI labs such as Moonshot.

“To be clear, DeepSeek remains unique and ahead of the curve when it comes to achieving cost optimization for powerful AI models,” SemiAnalysis stressed, adding that DeepSeek R1 is a “very good model” and that catching up to the global frontier of AI reasoning so quickly is “truly impressive.”

DeepSeek has not yet commented.

In its previous announcements, DeepSeek also did not provide an overall figure, beyond the $5.576 million, which was mainly for renting AI servers and “formal training” of the models. That is, this figure does not take into account research and experiments related to architecture, algorithms, or data.

Some experts had earlier noted that although DeepSeek found ways to train its AI model at low cost, $6 million is not the full figure. Yann LeCun, chief AI scientist at Meta, said there was a “big misunderstanding” in comparing American companies that spend billions of dollars on AI with DeepSeek.

“There is a big misunderstanding about AI infrastructure investment. The vast majority of those billions of dollars are invested in infrastructure for ‘inference,’ not training,” LeCun wrote on the social network Threads last week. “Running an AI assistant service for billions of people is computationally intensive. When you add video understanding, reasoning, large-scale storage, and other capabilities to an AI system, the cost of inference increases. So the market reaction to DeepSeek is ill-founded.”

Thomas Sohmers, founder of AI hardware startup Positron, agrees with LeCun that inference will make up a larger portion of AI infrastructure spending. “The demand for inference and infrastructure spending is going to grow rapidly,” he told Business Insider. “In the future, as popularity increases, DeepSeek will have to handle more requests, forcing it to spend more money on inference.”

Speaking to CNBC, Alexandr Wang, CEO of Scale AI, said he has information that DeepSeek owns 50,000 Nvidia H100 chips. “However, due to US export controls, DeepSeek cannot make this public,” Wang said. Elon Musk, founder of xAI and a close ally of President Donald Trump, agreed: “It is obvious.”

To keep costs low, the company is also said to have used a “distillation” technique. OpenAI told the Financial Times that it had seen signs of “distillation” that it suspected came from DeepSeek. Developers use this technique to get better performance out of small models by training them on the output of large models, which lets them reach similar results on specific tasks at a lower cost.
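
As a rough illustration of the idea, the sketch below shows the classic soft-label form of distillation, in which a small “student” model is scored on how closely its output probabilities match those of a large “teacher” model. It is a generic textbook formulation and is not a description of what DeepSeek or any other company actually did; the function names, the temperature value, and the example numbers are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The loss is zero when the student reproduces the teacher exactly and
    grows as the two distributions drift apart.
    """
    p = softmax(teacher_logits, temperature)  # teacher (large model)
    q = softmax(student_logits, temperature)  # student (small model)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three candidate answers.
teacher = [4.0, 1.0, 0.5]
near_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]   # prefers a different answer

print("near student loss:", distillation_loss(teacher, near_student))
print("far student loss:", distillation_loss(teacher, far_student))
```

Training the student to minimize this loss pulls its answers toward the teacher’s, which is why the approach can be far cheaper than training a large model from scratch.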

DeepSeek was founded in May 2023 by Liang Wenfeng. It is headquartered in Hangzhou, Zhejiang, and is owned by the High-Flyer investment fund. The company is funded by High-Flyer and has no plans to raise outside capital, focusing instead on building its underlying technology.
