DeepSeek Proves It: Open Source is the Secret to Dominating Tech Markets (and Wall Street has it wrong).
-
WTF dude. You mentioned Asia. I love Asians. Asia is vast. There are many countries, not just China bro. I think you need to do these reflections.
I'm talking about the very specific case of the Chinese DeepSeek devs potentially lying about the chips. The assumptions and generalizations you're making are crazy. -
And how do your feelings stand up to the fact that independent researchers find the paper to be reproducible?
-
the accepted terminology
No, it isn't. The OSI specifically requires that the training data be available, or at the very least that the source and fee for the data be given so that a user could get the same copy themselves. Because that's the purpose of something being "open source". Open source doesn't just mean free to download and use.
https://opensource.org/ai/open-source-ai-definition
Data Information: Sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system. Data Information shall be made available under OSI-approved terms.
In particular, this must include: (1) the complete description of all data used for training, including (if used) of unshareable data, disclosing the provenance of the data, its scope and characteristics, how the data was obtained and selected, the labeling procedures, and data processing and filtering methodologies; (2) a listing of all publicly available training data and where to obtain it; and (3) a listing of all training data obtainable from third parties and where to obtain it, including for fee.
As per their paper, DeepSeek R1 required a very specific training data set, because when they tried the same technique with less curated data, they got "R-Zero", which basically ran fast and spat out a gibberish salad of English, Chinese and Python.
People are calling DeepSeek open source purely because they called themselves open source, but they seem to just be another free-to-download, black-box model. The best comparison is to Meta's Llama, which weirdly nobody has decided is going to upend the tech industry.
In reality, "open source" is a very loose fit of a term here, when what it's basically trying to say is that anyone could recreate or modify the model because they have the exact 'recipe'.
-
Well, maybe. Apparently some folks are already doing that, but it's not done yet. Let's wait for the results. If everything is legit, we should have not one but plenty of similar and better models in the near future. If the Chinese did this with 100 chips, imagine what can be done with the 100000 chips that Nvidia can sell to a US company.
-
So are these techniques really so novel and groundbreaking?
The general concept, no. (it's reinforcement learning, something that's existed for ages)
The actual implementation, yes. (training a model to think inside a separate XML section, and reinforcing with the highest-quality results from previous iterations using reinforcement learning, which naturally pushes responses toward the highest-rewarded outputs) Most other companies just didn't assume this would work as well as throwing more data at the problem.
This is actually how people believe some of OpenAI's newest models were developed, but the difference is that OpenAI was under the impression that more data would be necessary for the improvements, and thus had to continue training the entire model with additional new information, whereas DeepSeek decided to simply scrap that part altogether and go solely for reinforcement learning.
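A toy sketch of that reward-driven loop (my own illustration, not DeepSeek's actual code; the reward function and the `<think>` tag convention here are assumptions for demonstration): generate several candidate responses, score each with a reward function, and keep the highest-rewarded one as the next round's fine-tuning target.

```python
def reward(response: str) -> float:
    """Toy reward: prefer responses that show their reasoning in a
    <think> section and end with the expected answer. A real reward
    model would be far richer; this is illustrative only."""
    score = 0.0
    if "<think>" in response and "</think>" in response:
        score += 1.0  # reward visible reasoning
    if response.rstrip().endswith("42"):
        score += 1.0  # reward a correct final answer
    return score

def best_of_n(candidates: list[str]) -> str:
    # Keep only the highest-rewarded output, rejection-sampling style;
    # these winners would then serve as targets for the next iteration,
    # pulling the model toward highly rewarded behavior.
    return max(candidates, key=reward)

candidates = [
    "42",
    "<think>6 * 7 = 42</think> 42",
    "<think>not sure</think> 41",
]
print(best_of_n(candidates))  # the reasoning-plus-correct-answer candidate wins
```

The point of the sketch is just the selection pressure: responses that both reason and answer correctly accumulate the most reward, so repeated rounds of this loop push the model toward that style.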
Will we now have a burst of deepseek like models everywhere?
Probably, yes. Companies and researchers are already beginning to use this same methodology. Here's a writeup about S1, a model that performs up to 27% better than OpenAI's best model. S1 used supervised fine-tuning and did something so basic that people hadn't previously thought to try it: just making the model think longer by modifying the terminating XML tags.
This was released days after R1, based on R1's initial premise, and creates better quality responses. Oh, and of course, it cost $6 to train.
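A minimal sketch of that "think longer" trick (the `</think>` tag and "Wait," cue are illustrative assumptions; the real method operates on special tokens inside the decoding loop, not on finished strings): each time the model tries to close its reasoning section, swap the closing tag for a continuation cue so generation keeps reasoning.

```python
def force_longer_thinking(output_so_far: str, extra_rounds: int = 2) -> str:
    """Budget-forcing sketch: whenever the model emits the closing
    reasoning tag, replace it with a continuation word so decoding
    would keep producing reasoning tokens instead of stopping."""
    END_TAG = "</think>"
    for _ in range(extra_rounds):
        if END_TAG not in output_so_far:
            break
        output_so_far = output_so_far.replace(END_TAG, " Wait,", 1)
        # ...a real system would resume generation here, and the model
        # would append more reasoning before trying to close again...
    return output_so_far

print(force_longer_thinking("<think>6 * 7 is 42</think>"))
```

The design choice is what makes it cheap: no retraining is needed to get longer chains of thought, only an intervention at the point where the model would have stopped reasoning.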
So yes, I think it's highly probable that we see a burst of new models, or at least improvements to existing ones. (Nobody has a very good reason to make a whole new model of a different name/type when they can simply improve the one they're already using and have implemented)
-
-
the accepted terminology nowadays
Let's just redefine existing concepts to mean things that are more palatable to corporate control why don't we?
If you don't have the ability to build it yourself, it's not open source. DeepSeek is "freeware" at best. And that's to say nothing of what the data is, where it comes from, and the legal ramifications of using it.
-
-
-
-
I think it's both. OpenAI was valued at a certain point because of a perceived moat of training costs. The cheapness killed the myth, but open sourcing it was the coup de grace as they couldn't use the courts to put the genie back into the bottle.
-
True, but I'm of the belief that we'll probably see a continuation of the existing trend of building on and improving existing models, rather than always starting entirely from scratch. For instance, nearly every newly released model reports the performance of its Llama variant, because combining it with the existing quality of Llama just produces better results.
I think we'll see a similar trend now, just with R1 variants instead of Llama variants being the primary new type used. It's just fundamentally inefficient to start over from scratch every time, so it makes sense that newer iterations would be built directly on previous ones.
-
The model weights and research paper are
I think you're conflating "open source" with "free"
What does it even mean for a research paper to be open source? That they release a docx instead of a pdf, so people can modify the formatting? Lol
The model weights were released for free, but you don't have access to their source, so you can't recreate them yourself. Microsoft Paint isn't open source just because the machine instructions are distributed for free. Model weights are the AI equivalent of an exe file. To extend that analogy, quants, LoRAs, etc. are like community-made mods.
To be open source, they would have to release the training data and the code used to train it. They won't do that because they don't want competition. They just want to do the Facebook Llama thing, where they hope someone uses it to build the next big thing, so that Facebook can copy them and destroy them with a much better model that they didn't release, force them to sell, or kill them with the license.
-
There's so much misinfo spreading about this, and while I don't blame you for buying it, I do blame you for spreading it. "It sounds legit" is not how you should decide to trust what you read. Many people think the earth is flat because the conspiracy theories sound legit to them.
DeepSeek probably did lie about a lot of things, but their results are not disputed. R1 is competitive with leading models, it's smaller, and it's cheaper. The good results are definitely not from "sheer chip volume and energy used", and American AI companies could have saved a lot of money if they had used those same techniques.
-
Ah, cool, a new account to block.
-
I hate to disagree, but IIRC DeepSeek is not an open-source model but an open-weight one?
-
-
Governments and corporations still use the same playbooks because they're still oversaturated with Boomers who haven't learned a lick since 1987.
-
Idk, I kind of disagree with some of their updates at least in the UI department.
-
Not exactly sure what "dominating" a market means, but the title makes a good point: innovation requires much more cooperation than competition. And the 'AI race' between nations is an antiquated framing pushed by the media.