DeepSeek, a Chinese artificial
intelligence (AI) start-up, released its latest version, DeepSeek-R1, on
January 20, 2025. Its app quickly became the most-downloaded free app on Apple’s
App Store, surpassing even ChatGPT. The
release stunned investors and tech giants in Silicon Valley.
President Trump referred to DeepSeek’s
success as a “wake-up call” for US companies, emphasizing the need to focus on
“competing to win”. Not much is known about the people behind DeepSeek except
that it was founded in May 2023 by Liang Wenfeng, who graduated from Zhejiang
University with degrees in electronic information engineering and computer
science. He has now become an international celebrity due to the rapid rise of DeepSeek.
Now the question is: What made DeepSeek
so special? The answer is simple: Using fewer advanced chips, DeepSeek claims
to have trained its model for $6 mn, which is significantly lower than the $100
mn reportedly spent by Silicon Valley companies like OpenAI. It says it achieved this
cost efficiency through smart optimizations and by training only the
necessary parts of the model, a shift away from ever-larger models towards leaner ones. This
free AI-powered chatbot is built on an advanced large language model designed for
enhanced reasoning and analytical capabilities. It focuses on reasoning
rather than just generating responses based on existing data. DeepSeek offers its
models under an open-source license, making them freely available to users.
Though it looks, feels, and
works very much like ChatGPT, DeepSeek R1 costs just $0.55 per
million input tokens, as against the $15 per million input tokens charged by OpenAI.
On top of that, DeepSeek reportedly demonstrated superior performance in coding tasks, achieving a
97% success rate. Using a multi-token prediction approach, DeepSeek
predicts several pieces of text (tokens) at once, and thus delivers its
responses faster and more accurately.
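To put the per-token prices quoted above in perspective, here is a minimal Python sketch of the arithmetic. The two prices are the figures cited in this post; the monthly workload of 50 million input tokens is a purely hypothetical number chosen for illustration, and the sketch covers input tokens only, since output-token prices are not discussed here.

```python
# Prices per million input tokens in USD, as quoted in this post.
DEEPSEEK_R1_PER_MILLION = 0.55
OPENAI_PER_MILLION = 15.00


def input_cost(tokens: int, price_per_million: float) -> float:
    """Return the cost in USD of sending `tokens` input tokens."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload (an assumption, not a figure from the post).
monthly_tokens = 50_000_000

deepseek = input_cost(monthly_tokens, DEEPSEEK_R1_PER_MILLION)
openai = input_cost(monthly_tokens, OPENAI_PER_MILLION)

print(f"DeepSeek R1: ${deepseek:,.2f}")
print(f"OpenAI:      ${openai:,.2f}")
print(f"Ratio:       {openai / deepseek:.1f}x")
```

Under these assumptions the same workload costs about $27.50 on DeepSeek R1 against $750.00 at the OpenAI price, a gap of roughly 27 times.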
Indeed, it is reported to be as
powerful as OpenAI’s recently released o1 model in executing tasks
including mathematics and coding. Moreover, for the first time, a Chinese
company has joined the Silicon Valley companies as an innovator rather than just as
their follower.
Thus, DeepSeek has shown that
cutting-edge AI models can be developed with limited computing resources. Its
hybrid architecture and accessibility across devices made it popular worldwide
overnight. It reportedly outperforms ChatGPT at technical tasks such as coding
and logical problem-solving. It has positioned itself among techies as a
transformative force in AI development.
Nevertheless, ChatGPT is known
for its “versatility, user-friendly design and strong contextual
understanding”, features that enable it to offer broader adaptability across
industries. By contrast, DeepSeek focuses mainly on technical applications.
At the same time, both tools face challenges such as biases in training data
and deployment demands.
That said, one must admit that
DeepSeek owes much of its success to Google’s original 2017 transformer
architecture and OpenAI’s o1 model, released in September 2024. DeepSeek R1, as
stated in a paper by its researchers, was trained on a large amount of synthetic
question-and-answer data generated with OpenAI’s GPT-4o model. It is no
exaggeration to say that without GPT-4o DeepSeek would not have developed such
an influential tool. It is this benefit that perhaps allowed DeepSeek to create an
AI product at such a low cost. Of course, it is common in the tech world to
copy a previously available product and build a new product with better
features.
Although DeepSeek boasts
impressive “chain of thought reasoning” and efficiency in text and mathematical
tasks, it falls behind ChatGPT in several features. For instance, DeepSeek
allows users to upload photos and file attachments but can only extract text
using age-old optical character recognition (OCR). In vision capabilities, it
pales in comparison to ChatGPT, which can analyze images, provide descriptions and
even offer further information based on user prompts. Additionally, DeepSeek lacks
a voice interaction mode, whereas ChatGPT supports natural conversational
interactions.
Intriguingly, the release of
DeepSeek, with the said cost advantages and free access to its model weights and
outputs (which empowers developers to build on its technology), made
chip-making giant Nvidia shed almost $600 bn of its market value on Monday, January 27, the
biggest one-day loss for any company in US history. Microsoft (-3.31%), Meta Platforms (-31%)
and Alphabet (-2.40%) too lost heavily in market capitalization.
That aside, DeepSeek’s success challenges
America’s efforts to contain China by imposing restrictions on the export of
advanced chips. As venture capitalist Marc Andreessen remarked, “DeepSeek R1
is AI’s Sputnik moment.” Just as Sputnik set off the space race between the erstwhile
USSR and the US, DeepSeek might spark an intense race, this time
between China and the US.
However, one must wait for the
full details to emerge to form a more meaningful view.