DeepSeek Proves It: Open Source is the Secret to Dominating Tech Markets (and Wall Street has it wrong).
-
Well, if they really are, and the methodology can be replicated, we are surely about to see a crazy number of DeepSeek competitors, because imagine how many US companies in the AI and finance sectors are in possession of even larger numbers of chips.
Although the question arises: if the methodology is so novel, why would these folks make it open source? Why would they share the results of years of their work with the public and lose their edge over the competition? I don't understand.
Can somebody who actually knows how to read a machine learning codebase tell us something about DeepSeek after reading their code?
-
Hugging Face has already reproduced DeepSeek R1 (calling it Open R1) and open sourced the entire pipeline.
-
Did they? According to their repo it's still a WIP:
https://github.com/huggingface/open-r1
-
They are trying to get it accepted, but it's still contested. Unless the training data is provided, it's not really open.
-
The training corpus of these large models seems to be "the internet, YOLO". It's apparently fine for them to download every book and paper under the sun, but if a normal person does it? Believe it or not: jail.
-
But then, people would realize that you got copyrighted material and stuff from pirating websites...
-
Sounds legit
-
DeepSeek shook the AI world because it’s cheaper, not because it’s open source.
And it’s not really open source either. Sure, the weights are open, but the training materials aren’t. Good luck looking at the weights and figuring things out.
-
I wouldn’t call it the accepted terminology at all. Just because some rich assholes try to will it into existence doesn’t mean we have to accept it.
-
It’s time for you to do some serious self-reflection about the inherent biases you believe about Asians.
-
We already have all the evidence. This isn’t some developing story, the paper is reproducible. What’s dehumanizing is assuming that Asians can’t make good software.
-
Wall Street’s panic over DeepSeek is peak clown logic—like watching a room full of goldfish debate quantum physics. Closed ecosystems crumble because they’re built on the delusion that scarcity breeds value, while open source turns scarcity into oxygen. Every dollar spent hoarding GPUs for proprietary models is a dollar wasted on reinventing wheels that the community already gave away for free.
The Docker parallel is obvious to anyone who remembers when virtualization stopped being a luxury and became a utility. DeepSeek didn’t “disrupt” anything—it just reminded us that innovation isn’t about who owns the biggest sandbox, but who lets kids build castles without charging admission.
Governments and corporations keep playing chess with AI like it’s a Cold War relic, but the board’s already on fire. Open source isn’t a strategy—it’s gravity. You don’t negotiate with gravity. You adapt or splat.
Cheap reasoning models won’t kill demand for compute. They’ll turn AI into plumbing. And when’s the last time you heard someone argue over who owns the best pipe?
-
Can you point out any factual inaccuracies or is it just that your wittew fee-fees got hurt?
-
I disagree; links don't take that long to share. It's a bit more time-consuming, obviously, but everyone can choose whether to read quickly or really dive into the sources. I see a lot of people doing it on the internet today, and in casual conversation too (opening a book or the internet to check something). It's not evidence, just hints, to avoid launching into a whole discussion that's entirely lies or bullshit (or not).
Here are some links I found about smuggled chips.
- Reuters: DeepSeek said they used legally imported old and new Nvidia chips (H800s and H20s). There are suspicions and investigations about illegal smuggling of export-banned Nvidia chips, targeting DeepSeek directly. The CEO of one American AI startup said it is likely DeepSeek used smuggled chips.
- The Diplomat: exactly the same, citing Reuters directly. Adds that the H800 (now banned from export) and the H20 were designed by Nvidia specifically for the Chinese market. Adds that smuggling could go through Singapore, which leaped from 9% to 22% of Nvidia's revenue in two years. Nvidia and Singapore representatives deny it.
- Fox Business: same.
So if we believe this, it is likely that there are smuggled chips in China. But whether they were used by DeepSeek, let alone whether they were decisive, is still very unclear.
-
WTF dude. You mentioned Asia. I love Asians. Asia is vast. There are many countries, not just China, bro. I think you're the one who needs to do some reflecting.
I'm talking about the very specific case of Chinese DeepSeek devs potentially lying about the chips. The assumptions and generalizations you're reading into that are crazy.
-
And how do your feelings stand up to the fact that independent researchers find the paper to be reproducible?
-
the accepted terminology
No, it isn't. The OSI specifically requires that the training data be available, or at the very least that the source and fee for the data be given so that a user could obtain the same copy themselves. That's the whole purpose of something being "open source". Open source doesn't just mean free to download and use.
https://opensource.org/ai/open-source-ai-definition
Data Information: Sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system. Data Information shall be made available under OSI-approved terms.
In particular, this must include: (1) the complete description of all data used for training, including (if used) of unshareable data, disclosing the provenance of the data, its scope and characteristics, how the data was obtained and selected, the labeling procedures, and data processing and filtering methodologies; (2) a listing of all publicly available training data and where to obtain it; and (3) a listing of all training data obtainable from third parties and where to obtain it, including for fee.
As per their paper, DeepSeek R1 required a very specific training data set, because when they tried the same technique with less curated data they got R1-Zero, which basically ran fast and spat out a gibberish salad of English, Chinese and Python.
People are calling DeepSeek open source purely because they called themselves open source, but it seems to just be another free-to-download, black-box model. The best comparison is Meta's Llama, which, weirdly, nobody has decided is going to up-end the tech industry.
In reality "open source" is a terrible terminology for what is a very loose fit when basically trying to say that anyone could recreate or modify the model because they have the exact 'recipe'.
-
Well, maybe. Apparently some folks are already doing that, but it's not done yet. Let's wait for the results. If everything is legit we should see not one but plenty of similar and better models in the near future. If the Chinese did this with 100 chips, imagine what can be done with the 100,000 chips that Nvidia can sell to a US company.
-
So, are these techniques really that novel and groundbreaking?
The general concept, no. (It's reinforcement learning, something that has existed for ages.)
The actual implementation, yes. (Training the model to think inside a separate XML section, then reinforcing the highest-quality results from previous iterations with reinforcement learning, which naturally pushes responses toward the highest-rewarded outputs.) Most other companies just didn't assume this would work as well as throwing more data at the problem.
This is actually how people believe some of OpenAI's newest models were developed, but the difference is that OpenAI was under the impression that more data would be necessary for the improvements, and thus had to continue training the entire model with additional new information, whereas DeepSeek decided to simply scrap that part altogether and go solely for reinforcement learning.
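To make the "pushes responses toward the highest-rewarded outputs" part concrete, here's a rough Python sketch of the kind of rule-based reward such a setup could use: one score for keeping the reasoning inside a <think> section and one for getting the final answer right. The tag names and helper functions are my own illustrative assumptions, not DeepSeek's actual code.

```python
import re

# Assumed output format: reasoning in <think>...</think>, result in <answer>...</answer>
FORMAT_RE = re.compile(r"^<think>.*?</think>\s*<answer>.*?</answer>\s*$", re.DOTALL)

def format_reward(completion: str) -> float:
    """1.0 if the completion keeps its reasoning in <think> tags and its result in <answer> tags."""
    return 1.0 if FORMAT_RE.match(completion.strip()) else 0.0

def accuracy_reward(completion: str, reference: str) -> float:
    """1.0 if the <answer> section matches the reference answer exactly."""
    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    return 1.0 if m and m.group(1).strip() == reference.strip() else 0.0

def total_reward(completion: str, reference: str) -> float:
    # An RL step (PPO/GRPO-style) would sample many completions, score them
    # like this, and update the model toward the high-scoring ones.
    return format_reward(completion) + accuracy_reward(completion, reference)

# Toy check
sample = "<think>2 + 2 = 4, so the answer is 4.</think> <answer>4</answer>"
print(total_reward(sample, "4"))  # -> 2.0
```

The appeal is that a reward like this is cheap to compute from the model's own outputs, which fits the "go solely for reinforcement learning" framing above.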
Will we now have a burst of DeepSeek-like models everywhere?
Probably, yes. Companies and researchers are already beginning to use this same methodology. There's a writeup about S1, a model that performs up to 27% better than OpenAI's best model. S1 used supervised fine-tuning, and did something so basic that people hadn't previously thought to try it: just making the model think longer by modifying the terminating XML tags (rough sketch below).
This was released days after R1, builds on R1's initial premise, and produces better-quality responses. Oh, and of course, it cost $6 to train.
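For a sense of how "just make it think longer" can work in practice, here's a minimal sketch, assuming a model that emits its reasoning inside <think>...</think> tags: whenever it tries to close the think section early, you withhold the closing tag and append something like "Wait," so it has to keep reasoning. The generate() callable and its stop argument are placeholders rather than any particular library's API, and this is only an illustration of the general idea, not S1's exact method.

```python
def think_longer(generate, prompt: str, extra_rounds: int = 2) -> str:
    """Minimal sketch of forcing extra reasoning before the final answer.

    `generate(prefix, stop)` is a placeholder: any text-completion call that
    continues `prefix` and halts just before emitting the `stop` string.
    """
    text = prompt + "<think>"
    for _ in range(extra_rounds):
        # Let the model reason until it tries to close the think section...
        text += generate(text, stop="</think>")
        # ...then, instead of letting it stop, nudge it to keep going.
        text += "\nWait,"
    # Finally let it close the reasoning section and produce the answer.
    return text + generate(text, stop=None)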
So yes, I think it's highly probable that we'll see a burst of new models, or at least improvements to existing ones. (Nobody has a very good reason to build a whole new model under a different name/type when they can simply improve the one they're already using and have already implemented.)
-