OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole From Us
-
-
Rob Reiner's dad Carl was best friends with Mel Brooks for almost all of Carl's adult life.
https://www.vanityfair.com/hollywood/2020/06/carl-reiner-mel-brooks-friendship
-
I'm not all that knowledgeable either lol. It's my understanding, though, that what you download, the "model," is the result of their training. You would need some other way to train it yourself, and I'm not sure how you'd go about that. The model is essentially the "product" that the training produces.
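To make that concrete, here's a toy sketch in Python (nothing to do with DeepSeek's actual code, just the idea): "training" is the expensive step that produces some numbers, and "the model" you download is just those numbers, which anyone can load and use without redoing the training.

```python
import os
import tempfile
import numpy as np

# Toy "training": fit y = 2x + 1 from data. This is the expensive step.
x = np.arange(10, dtype=float)
y = 2 * x + 1
w, b = np.polyfit(x, y, 1)  # learned parameters: slope and intercept

# "The model" is just the numbers training produced; save them to disk.
path = os.path.join(tempfile.gettempdir(), "toy_model.npy")
np.save(path, np.array([w, b]))

# Anyone who downloads that file can load it and predict
# without ever repeating the training step.
w2, b2 = np.load(path)
print(round(w2 * 5 + b2, 3))  # ~11.0
```

A real LLM download is the same idea at scale: billions of learned numbers instead of two.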
-
How do you know it isn't communicating with their servers? Obviously it needs an internet connection to work, so what's stopping it from sending your data?
-
-
The way they found to train their AI more cheaply was to steal it from OpenAI (not that I care). They still need GPUs to process the prompts and generate the responses.
-
It's called distillation: you turn the huge model's outputs into a compact form that can be used to train another, smaller model.
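A rough sketch of the idea in Python (toy numbers, not anything DeepSeek actually does): the big "teacher" model's full probability distribution over answers, softened with a temperature, becomes the training target for the small "student," which carries more information than just the single top answer.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, optionally softened by temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

# Hypothetical teacher scores for three candidate answers to one prompt.
teacher_logits = np.array([4.0, 1.5, 0.5])

# Distillation targets: the teacher's *soft* distribution (T > 1 keeps
# information about which wrong answers the teacher considered plausible).
T = 2.0
soft_targets = softmax(teacher_logits, T)

# The student is trained to minimize cross-entropy against those soft targets.
student_logits = np.array([3.0, 1.0, 1.0])
student_probs = softmax(student_logits, T)
loss = -np.sum(soft_targets * np.log(student_probs))
print(soft_targets, loss)
```

In practice this loss is computed over huge numbers of prompts, and the student's weights are updated to drive it down.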
-
-
-
-
Tamaleeeeeeeeesssssss
hot hot hot hot tamaleeeeeeeees
-
-
-
-
-
-
Right—by “take it down” I meant take down online access to their running instance of it.
-
-
You made me look ridiculously stupid, and rightfully so. Actually, I take that back: I made myself look stupid, and you made it as obvious as it gets! Thanks for the wake-up call.
If I understand correctly, the model is in a way a dictionary of questions with responses, where the journey of figuring out the response is skipped. As in, the answer for the question "What's the point of existence?" is "42," but it doesn't contain the thinking process that led to this result.
If that's so, then wouldn't it be especially prone to hallucinations? I don't imagine it would respond adequately to the third "why?" in a row.
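For what it's worth, that lookup-table picture taken literally would look something like this in Python (real models store learned weights, not memorized question-answer pairs, but this is the analogy as described):

```python
# The dictionary analogy, taken literally: memorized answers, no reasoning.
answers = {"What's the point of existence?": "42"}

def respond(question):
    # Either the exact answer was memorized, or the system has to improvise.
    return answers.get(question, "<made-up answer>")

print(respond("What's the point of existence?"))  # "42"
print(respond("why?"))                            # nothing memorized, improvises
```

Under that analogy, any follow-up question outside the memorized set forces an improvised answer, which is the hallucination intuition in the comment above.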
-
To add a tiny bit to what was already explained: you do actually download quite a bit of data to run it locally. The "smaller" 14b model I used was a 9GB download. The 32b one is 20GB, and being all text, that's a lot of information.
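Those sizes line up with simple arithmetic if you assume the local builds store roughly 5 bits per weight (an assumption; quantized local models commonly use 4-5 bits per parameter, and the exact format isn't stated here):

```python
# Back-of-envelope: download size ≈ parameters × bits-per-weight / 8 bits-per-byte.
def approx_size_gb(params_billion, bits_per_weight=5):
    """Rough download size in GB, assuming uniform bits per weight."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

print(approx_size_gb(14))  # ~8.75 GB, close to the observed 9GB
print(approx_size_gb(32))  # ~20.0 GB, matching the observed 20GB
```

At full 16-bit precision the same models would be roughly three times larger, which is why quantized builds are what people typically run locally.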