Karpuramanjari: DeepSeek: Made Deep Inroads into Silicon Valley!

DeepSeek, a Chinese artificial intelligence (AI) start-up, released its latest model, DeepSeek-R1, on January 20, 2025. Its app quickly became the most-downloaded free app on Apple’s App Store, surpassing even ChatGPT. The release stunned investors and tech giants in Silicon Valley.

President Trump referred to DeepSeek’s success as a “wake-up call” for US companies, emphasizing the need to focus on “competing to win”. Not much is known about the people behind DeepSeek except that it was founded in May 2023 by Liang Wenfeng, who graduated from Zhejiang University with degrees in electronic information engineering and computer science. He has now become an international celebrity due to the rapid rise of DeepSeek.

Now the question is: what made DeepSeek so special? The answer is simple: using fewer advanced chips, DeepSeek claims to have trained its model for $6 mn, far below the $100 mn reportedly spent by Silicon Valley companies like OpenAI. It achieved this cost efficiency through smart optimizations and by training only the necessary parts of the model, a move from large models to a smaller one. This free AI-powered chatbot is built on an advanced large language model designed with enhanced reasoning and analytical capabilities: it focuses on reasoning rather than just generating responses based on existing data. DeepSeek offers its models under an open-source license, making them freely available to users.

Though it looks, feels, and works very much like ChatGPT, DeepSeek R1 costs just $0.55 per million input tokens, as against the $15 per million input tokens charged by OpenAI. Moreover, DeepSeek demonstrated superior performance in coding tasks, achieving a 97% success rate. Using a multi-token prediction approach, DeepSeek predicts several pieces of information (tokens) at once, and thus delivers its responses faster and more accurately.
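
To put the price gap in perspective, here is a minimal sketch of the arithmetic, using only the two per-million-token prices quoted above. The workload of 10 million input tokens is a hypothetical figure chosen purely for illustration, and the helper name `input_cost` is my own.

```python
# Input-token prices quoted in the text (USD per million input tokens).
PRICE_PER_MILLION = {"DeepSeek R1": 0.55, "OpenAI": 15.00}


def input_cost(tokens: int, price_per_million: float) -> float:
    """Return the USD cost of `tokens` input tokens at a per-million price."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload (an assumption, not a figure from the article).
tokens = 10_000_000

costs = {name: input_cost(tokens, price) for name, price in PRICE_PER_MILLION.items()}
ratio = PRICE_PER_MILLION["OpenAI"] / PRICE_PER_MILLION["DeepSeek R1"]

for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}")
print(f"Price ratio: about {ratio:.1f}x")
```

On these quoted prices, the same input volume costs roughly 27 times more at the higher rate. Output-token prices are not part of this comparison.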

Indeed, it is reported to be as powerful as OpenAI’s recently released o1 model in tasks including mathematics and coding. What is more, for the first time, a Chinese company has joined the Silicon Valley companies as an innovator rather than just a follower.

Thus, DeepSeek has proved that cutting-edge AI models can be developed with limited computing resources. Its hybrid architecture and accessibility across devices made it popular worldwide overnight. It outperforms ChatGPT in technical tasks such as coding and logical problem-solving, and it has positioned itself among techies as a transformative force in AI development.

Nevertheless, ChatGPT is known for its “versatility, user-friendly design and strong contextual understanding”, features that enable it to offer broader adaptability across industries. By contrast, DeepSeek focuses mainly on technical applications. At the same time, both tools face challenges such as biases in training data and deployment demands.

That said, one must admit that DeepSeek owes much of its success to Google’s original 2017 transformer architecture and OpenAI’s o1 model, released in September 2024. DeepSeek R1, as stated in a paper by its researchers, was trained on a large volume of synthetic question-and-answer data generated with OpenAI’s GPT-4o model. It is no exaggeration to say that without GPT-4o, DeepSeek would not have developed such an influential tool. It is perhaps this benefit that allowed DeepSeek to create an AI product at such a low cost. Of course, it is common in the tech world to copy an existing product and build a new one with better features.

Although DeepSeek boasts impressive “chain of thought reasoning” and efficiency in text and mathematical tasks, it falls behind ChatGPT in several features. For instance, DeepSeek allows users to upload photos and file attachments but can only extract text using age-old optical character recognition (OCR). In vision capabilities, it pales in comparison to ChatGPT, which can analyze images, provide descriptions and even offer further information based on user prompts. Additionally, DeepSeek lacks a voice interaction mode, whereas ChatGPT supports natural conversational interactions.

Intriguingly, the release of DeepSeek, with its cost advantages and free access to its model weights and outputs, which empowers developers to build on its technology, made chip-making giant Nvidia shed almost $600 bn of its market value on Monday, January 27, the biggest one-day loss for any company in US stock-market history. Microsoft (-3.31%), Meta Platforms (-31%) and Alphabet (-2.40%) too lost heavily in market capitalization.

That aside, DeepSeek’s success challenges America’s efforts to contain China by restricting the export of advanced chips. As venture capitalist Marc Andreessen remarked, “DeepSeek R1 is AI’s Sputnik moment.” Reminiscent of the space race between the erstwhile USSR and the US, DeepSeek might similarly spark an intense race, this time between China and the US.

However, one must wait for the full details to emerge to form a more meaningful view.

Karpuramanjari

January 29, 2025
