DeepSeek, the Chinese startup that rocked markets this year, said in a WeChat post on Wednesday morning that it had released an upgraded version of its R1 reasoning AI model on the Hugging Face developer platform.
The release was low-key: although the upgraded R1 was made available on the AI model repository Hugging Face, the company made no formal announcement beyond the WeChat post.
According to DeepSeek’s WeChat announcement, the new R1, which is available for commercial use under a permissive MIT license, is a “minor” upgrade. The Hugging Face repository contains no description of the model, only configuration files and weights, the internal components of a model that determine how it behaves.
With 685 billion parameters, the upgraded R1 is a heavyweight. (The terms “parameters” and “weights” are often used interchangeably.) At that size, the model probably cannot run on consumer-grade hardware without modification.
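To see why a model of this size is out of reach for typical consumer machines, a rough back-of-the-envelope estimate helps. The sketch below multiplies the reported parameter count by an assumed number of bytes per parameter. The function name and the precision table are illustrative assumptions, not details from DeepSeek, and the estimate covers only the weights, not the additional memory needed at run time.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter count reported for the upgraded R1.
PARAMS = 685e9

# Assumed storage precisions, for illustration only; the format the
# released weights actually use is not stated in the article.
PRECISIONS = {
    "16-bit": 2.0,
    "8-bit": 1.0,
    "4-bit": 0.5,
}

if __name__ == "__main__":
    for name, nbytes in PRECISIONS.items():
        print(f"{name}: roughly {weight_memory_gb(PARAMS, nbytes):,.0f} GB")
```

Even under the most aggressive assumption here (4 bits per parameter), the weights alone would take up hundreds of gigabytes, far more than the memory of a typical consumer graphics card.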
DeepSeek rose to prominence earlier this year when its free, open-source R1 reasoning model outperformed competing products from companies such as Meta and OpenAI. Some US regulators have taken issue with the startup, claiming that DeepSeek’s technology represents a national security threat.
Global markets were taken aback by the model’s low cost and quick development, which raised worries that American tech companies were overspending on infrastructure and wiped billions of dollars off the value of major American tech stocks, including AI mainstay Nvidia. Since then, these businesses have mostly recovered.
Like the initial release of DeepSeek R1, the upgraded version was introduced with little fanfare. Because it is a reasoning model, it can carry out more complex tasks by working through them in a methodical, step-by-step thought process.
On LiveCodeBench, a leaderboard that compares models on code-generation tasks, the upgraded DeepSeek R1 model trails only OpenAI’s o4-mini and o3 reasoning models.
DeepSeek has emerged as a prime example of how, in spite of American efforts to limit China’s access to semiconductors and other technologies, Chinese artificial intelligence is still evolving. Chinese tech behemoths Baidu and Tencent disclosed this month how they were improving the efficiency of their AI models in response to U.S. restrictions on chip exports.
Jensen Huang, the CEO of Nvidia, which makes the graphics processing units needed to train massive AI models, criticized U.S. export restrictions on Wednesday.
Huang claimed that the United States’ policy is predicated on the idea that China is incapable of producing AI chips. “That assumption was always dubious, and it’s obviously incorrect now.”
“The question is not whether China will have AI,” Huang went on to say. “It does already.”