deepseek's model claims to be chatgpt during conversation. what does this mean?
-
[email protected] replied to [email protected] last edited by
Do you have a proof of this? Like a screengrab or something?
-
[email protected] replied to [email protected] last edited by
I have seen multiple reports claiming DeepSeek's datasets are based on outputs of other LLMs before it.
-
[email protected] replied to [email protected] last edited by
this means deepseek is based on an openai model?
It doesn't sound like it is. It sounds more like it's hallucinating, since DeepSeek has a really light final fine-tuning pass. But who knows? While their stuff is open source, no one has tested it yet to see if they can reproduce the results DeepSeek got. For all we know this is just a Chinese con, or the real deal. But not knowing how you landed at this point of the conversation, it comes off as a context-aware hallucination.
It knows about OpenAI and about being an LLM, but it has mixed up its specific identity with identity in general. That is, it's starting to confuse LLMs and ChatGPT as meaning the same thing, and then trying to wire that bad assumption back into making sense again.
Again, who really knows at this point? It's too new, and with it being in China, there's likely no way to verify these people's claims until someone can take what they've published and make a similar LLM.
-
[email protected] replied to [email protected] last edited by
I believe it has been confirmed that DeepSeek-R1 was trained with RL datasets that originated with ChatGPT. Pretty standard.
-
[email protected] replied to [email protected] last edited by
The code might be open. Are the training data sets?
-
[email protected] replied to [email protected] last edited by
Apparently other models have similar behaviors (misidentifying themselves) especially when using different languages.
-
[email protected] replied to [email protected] last edited by
Screen grabs are not proof. I'm on mobile ATM or I could screen grab you admitting to lusting after Elon's buttocks.
-
[email protected] replied to [email protected] last edited by
Not sure stolen is the right word.
It's a "distilled" model. A freaky freeze-dried Frankenstein's monster of bigger AIs.
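For what "distilled" generically means here (no claim about DeepSeek's actual pipeline): a smaller student model is trained to match a bigger teacher's full output distribution over next tokens, not just its top answer. A minimal sketch of the classic soft-label distillation loss, with made-up toy logits and a hypothetical 4-word vocabulary:

```python
import math

def softmax(logits, temperature=1.0):
    # Scale logits by temperature; higher T softens the distribution,
    # exposing more of the teacher's "dark knowledge" about near-misses.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over the softened distributions: zero when
    # the student exactly matches the teacher, positive otherwise.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token logits (made-up numbers, not from any real model).
teacher = [3.0, 1.0, 0.2, -1.0]
student = [2.0, 2.0, 0.0, -0.5]
print(distillation_loss(teacher, student))
```

In a real training loop this term is minimized by gradient descent on the student's weights, usually mixed with an ordinary cross-entropy loss on the ground-truth labels.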
-
[email protected] replied to [email protected] last edited by
Asking an LLM about itself leads to a lot of lies. Don't make that mistake like I did.
I asked llama if it sends data back to Meta and it said yes, it does. I thought that was big news and wrote a blog post about it, because it was supposed to be offline, etc.:
https://jeena.net/llama3-phoning-home
And oh man, people started laughing and pointing out how stupid what I did was, and that it was obviously a hallucination.
-
[email protected] replied to [email protected] last edited by
All that censorship, yet they could not be bothered to replace ChatGPT with DeepSeek in their stolen training data.
-
[email protected] replied to [email protected] last edited by
Ask again. It will change its answer. It has no mind.
-
[email protected] replied to [email protected] last edited by
How did they do it so cheaply?
They stole it. Which is pretty fucking ironic if you ask me.