deepseek's model claims to be chatgpt during conversation. what does this mean?
-
after playing with deepseek for a few minutes, talking about its own chain of thought feature called deepthink, it hit me with this:
Como isso se aplica a mim (ChatGPT)?
(tr. how does this apply to me (chatgpt)?)
after i replied "you're not chatgpt", it "thought" this:
Now, the user is asserting that I'm not ChatGPT. [...] I need to acknowledge their point while clarifying my identity. [...] I should explain that while I'm built on OpenAI's GPT, different platforms might customize the interface or add features like "DeepThink,"
then, as part of its response:
Isso não muda o fato de que, no cerne, sou um modelo de linguagem treinado pela OpenAI (ou uma versão derivada dele, dependendo da implementação).
(tr. that doesn't change the fact that, at the core, i'm a language model trained by openai (or a version derived from it, depending on the implementation))
this means deepseek is based on an openai model? i thought their model was proprietary
thanks
-
-
[email protected] replied to [email protected]
Considering that all these models are trained on stolen content, no one should be surprised if this were true.
-
[email protected] replied to [email protected]
I don't think it means much. Ask it again later and it will change its mind
-
[email protected] replied to [email protected]
I fully agree that this doesn’t mean anything, but I think it would be hilarious if we found out deepseek is just ChatGPT in a trenchcoat or something
-
[email protected] replied to [email protected]
I've heard of this happening when you generate datasets with ChatGPT to help train your model. OpenAI doesn't want you doing this, making it against their terms of use, but there's nothing they can do to stop people. You can generate some really good synthetic datasets from ChatGPT, and it's perfectly legal to do.
Were you running it locally?
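For anyone curious what "generating a synthetic dataset from ChatGPT" looks like in practice, here's a toy sketch. The `teacher_answer` function is a made-up stand-in for a real chat API call (in a real pipeline you'd hit the provider's API and pay per token); the structure of the records is the standard instruction-tuning shape, but nothing here is DeepSeek's actual code.

```python
# Toy sketch of "distilling" a teacher chat model into a training set.
# teacher_answer is a hypothetical stand-in for a real chat API call.
def teacher_answer(prompt: str) -> str:
    canned = {
        "What is 2 + 2?": "2 + 2 equals 4.",
        "What is the capital of France?": "The capital of France is Paris.",
    }
    return canned.get(prompt, "I'm not sure.")

def build_synthetic_dataset(prompts):
    # Each (instruction, response) pair becomes one fine-tuning
    # example for the student model.
    return [
        {"instruction": p, "response": teacher_answer(p)}
        for p in prompts
    ]

dataset = build_synthetic_dataset(
    ["What is 2 + 2?", "What is the capital of France?"]
)
print(len(dataset))  # prints 2
```

If the teacher's answers happen to include phrases like "as a language model trained by OpenAI", those phrases end up in the student's training data too, which is one plausible route to the behavior in the original post.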
-
[email protected] replied to [email protected]
From what I've seen, DeepSeek is particularly prone to hallucinating and is extremely suggestible. It just thought it knew what you wanted to hear and said that, then got confused trying to justify what it had said.
-
[email protected] replied to [email protected]
Training data for these models used to be text off the internet plus some manually written Q&A examples to make the model behave more like a chat bot (instruction tuning). Because there is still a need for more data, they have started adding AI-generated text to the dataset. This technique doesn't add new knowledge, but it has been shown to reduce hallucinations, likely because this data is more focused, truthful and structured than the median text in the existing datasets. They would probably have data from every major chat provider in there, especially the big boys.
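A rough sketch of that data mixing, for illustration only: all the names and the 20% fraction below are my own assumptions, not anything any lab has published.

```python
import random

def mix_training_data(web_docs, synthetic_docs,
                      synthetic_fraction=0.2, seed=0):
    # Build a training mix where a fixed fraction of documents is
    # model-generated text and the rest is ordinary web text.
    rng = random.Random(seed)
    n_synth = int(len(web_docs) * synthetic_fraction)
    mix = list(web_docs) + rng.sample(synthetic_docs, n_synth)
    rng.shuffle(mix)
    return mix

web = [f"web doc {i}" for i in range(8)]
synth = [f"chatbot Q&A {i}" for i in range(4)]
mix = mix_training_data(web, synth)
print(len(mix))  # 8 web docs + 1 synthetic doc = 9
```

The point is just that the synthetic slice rides along with everything else, so any quirks in the chatbot-generated text (like self-identification phrases) get baked into the final model.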
-
[email protected] replied to [email protected]
Do you have a proof of this? Like a screengrab or something?
-
[email protected] replied to [email protected]
I have seen multiple reports claiming DeepSeek's datasets are based on outputs of the LLMs that came before it.
-
[email protected] replied to [email protected]
this means deepseek is based on an openai model?
It doesn't sound like it is. It sounds more like a hallucination; DeepSeek has fairly light fine-tuning at the end. But who knows? While their stuff is open source, no one has yet tested it to see if they can reproduce the results DeepSeek got. For all we know this is just a Chinese con or the real deal. Not knowing how you landed at this point in the conversation, though, it comes off as a context-aware hallucination.
It knows about OpenAI and about being an LLM, but it has mixed up its own specific identity with identity in general. That is, it has started to confuse "LLM" and "ChatGPT" as meaning the same thing, and is then trying to wire that bad assumption back into something that makes sense.
Again, who really knows at this point? It's too new, and with the company being in China there's likely no way to verify their claims until someone takes what they've published and builds a similar LLM.
-
[email protected] replied to [email protected]
I believe it has been confirmed that DeepSeek-R1 was trained with RL datasets that originated with ChatGPT. Pretty standard.
-
[email protected] replied to [email protected]
The code might be open. Are the training data sets?
-
[email protected] replied to [email protected]
Apparently other models have similar behaviors (misidentifying themselves) especially when using different languages.
-
[email protected] replied to [email protected]
Screen grabs are not proof. I'm on mobile ATM or I could screen grab you admitting to lusting after Elon's buttocks.
-
[email protected] replied to [email protected]
Not sure stolen is the right word.
It's a "distilled" model. Freaky freeze dried Frankenstein monster of bigger AIs.
-
[email protected] replied to [email protected]
Asking an LLM about itself leads to a lot of lies. Don't make that mistake like I did.
I asked llama if it sends data back to meta and it said yes it does. I thought that's big news and wrote a blog post about it, because it was supposed to be offline, etc.:
https://jeena.net/llama3-phoning-home
And oh man, people started laughing and pointing out how stupid what I did was, and that it was obviously a hallucination.
-
[email protected] replied to [email protected]
All that censorship, yet they could not be bothered to replace ChatGPT with deepseek in their stolen training data
-
[email protected] replied to [email protected]
Ask it again and it will change its answer. It has no mind.
-
[email protected] replied to [email protected]
How did they do it so cheaply?
They stole it. Which is pretty fucking ironic if you ask me.