DeepSeek is now widely used.
This week, Chinese AI lab DeepSeek gained widespread attention when its chatbot app topped the charts on the Apple App Store (and Google Play, too). DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists to question whether the United States can maintain its lead in the AI race and whether demand for AI chips will be sustained.
But where did DeepSeek come from, and how did it become so well known around the world so quickly?
DeepSeek’s roots in trading
DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.
In 2015, Liang Wenfeng, an AI enthusiast, co-founded High-Flyer. According to reports, Wenfeng started experimenting with trading while attending Zhejiang University. In 2019, he established High-Flyer Capital Management, a hedge fund dedicated to creating and implementing AI algorithms.
In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools separate from its financial business. The lab later spun off into its own company, also called DeepSeek, with High-Flyer as one of its investors.
From the start, DeepSeek built its own data center clusters for model training. However, like other AI companies in China, DeepSeek has been affected by U.S. export restrictions on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 chip that is available to U.S. companies.
DeepSeek’s technical team is said to skew young. The company reportedly recruits AI researchers with doctorates aggressively from top Chinese universities. According to The New York Times, DeepSeek also hires people without computer science backgrounds to help its technology better understand a wide range of subjects.
DeepSeek’s strong models
In November 2023, DeepSeek released its first set of models, which included DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat. However, the AI industry didn’t start paying attention until spring 2024, when the startup unveiled its next-generation DeepSeek-V2 family of models.
DeepSeek-V2, a general-purpose text and image analysis system, performed well on a number of AI benchmarks and was far cheaper to run than comparable models at the time. It forced ByteDance and Alibaba, two of DeepSeek’s domestic rivals, to cut the usage prices for some of their models and make others entirely free.
The December 2024 release of DeepSeek-V3 only added to DeepSeek’s reputation.
According to DeepSeek’s internal benchmark testing, DeepSeek-V3 outperforms both downloadable, openly available models, such as Meta’s Llama, and “closed” models that can only be accessed through an API, such as OpenAI’s GPT-4o.
DeepSeek’s R1 “reasoning” model is equally impressive. Released in January, R1 performs as well as OpenAI’s o1 model on key benchmarks, according to DeepSeek.
As a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that commonly trip up models. Reasoning models take a little longer, typically seconds to minutes longer, to arrive at solutions than a standard non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.
However, R1, DeepSeek-V3, and the other DeepSeek models have drawbacks. Because they were developed in China, they are subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” For instance, in DeepSeek’s chatbot app, R1 won’t answer questions about Tiananmen Square or Taiwan’s autonomy.
DeepSeek surpassed 16.5 million visits in March. “[F]or March, DeepSeek is in second place, despite seeing traffic drop 25% from where it was in February,” David Carr, editor at Similarweb, told TechCrunch. It still trails far behind ChatGPT, which surpassed 500 million weekly active users in March.
In May, DeepSeek made an updated version of its R1 reasoning AI model available on the Hugging Face developer platform.
A disruptive approach
If DeepSeek has a business model, it’s unclear exactly what it is. The company prices some of its products and services well below market value and gives others away for free. It is not taking investor money, despite considerable VC interest.
According to DeepSeek, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. However, some experts dispute the figures the company has supplied.
In any event, developers have embraced DeepSeek’s models, which are not open source in the traditional sense of the term but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms that hosts DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1 that have racked up a total of 2.5 million downloads.
DeepSeek’s success against bigger and more established competitors has been variously described as “upending AI” and “over-hyped.” The company’s success was at least partly responsible for the 18% drop in Nvidia’s stock price in January and for eliciting a public response from OpenAI CEO Sam Altman. According to Reuters, U.S. Commerce Department bureaus informed employees in March that DeepSeek would be banned on their government devices.
During Nvidia’s fourth-quarter earnings call, CEO Jensen Huang highlighted DeepSeek’s “excellent innovation.” Huang said that DeepSeek and other “reasoning” models are great for Nvidia because they require much more compute.
Meanwhile, some companies are banning DeepSeek, as are entire countries and governments, including South Korea. New York state has also banned DeepSeek from use on government devices.
At a Senate hearing in May, Microsoft president and vice chairman Brad Smith said that Microsoft employees are not allowed to use DeepSeek because of concerns about data security and propaganda.
It is unclear what the future holds for DeepSeek. Improved models are a given, but the U.S. government appears to be growing wary of what it perceives as harmful foreign influence. In March, The Wall Street Journal reported that the U.S. will likely ban DeepSeek on government devices.