DeepSeek collects keystroke data and more, storing it on Chinese servers
-
[email protected]replied to [email protected] last edited by
Not in the way you think. They aren't constantly training when interacting, that would be way more inefficient than what US AI companies have been doing.
It might be added to the training data, but a lot of training data now is apparently synthetic and generated by other models because while you might get garbage, it gives more control over the type of data and shape it takes, which makes it more efficient to train for specific domains.
-
[email protected]replied to [email protected] last edited by
It doesn't. They run using stuff like Ollama or other LLM tools, all of the hobbyist ones are open source. All the model is is the inputs, node weights and connections, and outputs.
LLMs, or neural nets at large, are kind of a "black box", but there's no actual code that gets executed from the model when you run them; the weights are just processed by the host software based on the rules for how these work. The "black box" part is mostly because these are so complex we don't actually know exactly what the model is doing or how it outputs answers, only that it works to a degree. It's a digital representation of analog brains.
People have also been doing a ton of hacking at it, retraining, and other modifications that would show anything like that if it could exist.
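To make the "the model is just weights" point concrete, here is a minimal sketch in plain Python. The JSON layout, the layer names and the numbers are all made up for illustration and are not any real model format. It also assumes a data-only weight file; pickle-based checkpoints are a known exception, since unpickling can run code.

```python
import json

# Hypothetical toy "model file": nothing but numbers (weights and biases).
MODEL_JSON = """
{
  "hidden": {"weights": [[0.5, -1.0], [1.5, 2.0]], "biases": [0.0, -1.0]},
  "output": {"weights": [[1.0, -1.0]], "biases": [0.5]}
}
"""

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per node, plus its bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Activation function: negative values are clamped to zero."""
    return [max(0.0, v) for v in values]

def run_model(model, inputs):
    """All the logic lives here, in the host software, not in the model."""
    hidden = relu(dense(inputs, **model["hidden"]))
    return dense(hidden, **model["output"])

model = json.loads(MODEL_JSON)  # loading is just parsing data
result = run_model(model, [1.0, 2.0])
print(result)
```

Swapping in different numbers changes the answers, but the only code that ever runs is the runner's own arithmetic.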
-
[email protected]replied to [email protected] last edited by
What would you have preferred? "Most apps sell your data, news at 11"? Would anyone care if it was written like that?
-
[email protected]replied to [email protected] last edited by
It's a chinese company, where else would they store the data?
-
[email protected]replied to [email protected] last edited by
Sure, but it's open source and doesn't upload it anywhere. It also doesn't have internet permission.
-
[email protected]replied to [email protected] last edited by
and that's what superpowers do, but living in a third world country i'm yet to see the chinese putsch us as the u.s. did during the cold war and beyond, with all due consequences. sorry about my lack of goodwill towards the department of state.
-
[email protected]replied to [email protected] last edited by
Oh yeah, the whole article could be reductively summed up as
"DeepSeek and all the other LLM services are almost as bad as each other, but we think deepseek is worse....because the Chinese government are known for doing bad things".
The title is factual, if a little clickbaity.
Obviously keystrokes you submit to a website are submitted to the website.
That's not technically accurate, though: a lot of forms and input are handled client side, and only the resulting information is parceled up and sent to the server.
The actual keystroke data isn't normally sent.
Though this article doesn't go into what kind of keystroke data is sent. If it was something more than just which keys in which order, then that's perhaps an indicator that it's actively being collected for a reason, rather than just incidentally.
If you want to get really paranoid about such things, it's known that you can do interesting things with actual keystroke data.
Also, afaict none of the non-Chinese services have specified that they don't do this.
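A rough sketch of the difference between "the text you typed" and "actual keystroke data", assuming the client logged press and release timestamps. The `KeyEvent` record and the sample timings are hypothetical; the article doesn't say what DeepSeek actually records.

```python
from dataclasses import dataclass

@dataclass
class KeyEvent:
    key: str
    down_ms: int  # timestamp when the key was pressed
    up_ms: int    # timestamp when the key was released

def submitted_text(events):
    """What a normal form submission contains: just the characters."""
    return "".join(e.key for e in events)

def dwell_times(events):
    """How long each key was held down."""
    return [e.up_ms - e.down_ms for e in events]

def flight_times(events):
    """Gap between releasing one key and pressing the next."""
    return [b.down_ms - a.up_ms for a, b in zip(events, events[1:])]

# Made-up sample: someone typing "hi!"
events = [KeyEvent("h", 0, 90), KeyEvent("i", 150, 230), KeyEvent("!", 400, 470)]

print(submitted_text(events))  # the content
print(dwell_times(events))     # typing rhythm, part 1
print(flight_times(events))    # typing rhythm, part 2
```

The first function gives the same result no matter who typed it. The other two describe typing rhythm, which is the kind of thing keystroke-dynamics work uses to tell typists apart.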
-
[email protected]replied to [email protected] last edited by
I shouldn't have anything to hide, but I'm part of a group the current fascist leadership in government wants to eradicate, so hide I shall.
That said, I also feel like people acting like the remote server they are connected to is tracking what you do on it as some kind of surprise is so stupid. "Facebook is keeping track of the pictures I uploaded to it!!!!" There's a lot of stuff to complain about Facebook, google, or whoever, but them tracking stuff you send to them willingly isn't one of them.
-
[email protected]replied to [email protected] last edited by
Isn't it open source? If so it should be near trivial to get rid of all of that.
If it's closed source I wouldn't touch it with a ten-foot pole. It's the same reason I rarely use ChatGPT: it's just freely giving away your personal data to OpenAI.
-
[email protected]replied to [email protected] last edited by
The runner is open source, and that's what matters in this discussion. If you host the model on your own servers, you can ensure that no corporation (American or Chinese) has access to your data. Access to the training code and data is irrelevant here.
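For anyone wondering what "host it yourself" looks like in practice, here's a hedged sketch that talks to a locally running Ollama instance over its HTTP API. It assumes Ollama is listening on its default port 11434 and that a model tagged `deepseek-r1:32b` has already been pulled; the helper names (`build_request`, `is_local`, `ask`) are mine, not part of Ollama.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="deepseek-r1:32b"):
    """Build the POST request without sending anything."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def is_local(request):
    """True if the request targets this machine's loopback interface."""
    return urlparse(request.full_url).hostname in {"localhost", "127.0.0.1", "::1"}

def ask(prompt):
    """Send the prompt to the local runner and return the generated text."""
    request = build_request(prompt)
    if not is_local(request):
        raise RuntimeError("refusing to send the prompt to a non-local host")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

# Usage (needs a running Ollama with the model pulled):
# print(ask("Why is the sky blue?"))
```

The prompt only goes to `localhost`, so whether it leaves the machine comes down to the open source runner, not to whoever trained the weights.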
-
[email protected]replied to [email protected] last edited by
HuggingChat is open source and lets you use DeepSeek.
Very misleading: it only lets you use the lighter, watered-down version (DeepSeek 32B), not the large, impressive model they have (DeepSeek 671B).
-
[email protected]replied to [email protected] last edited by
I'm confused. Isn't "collecting keystroke data" just an alarmist way to describe text entry?
-
[email protected]replied to [email protected] last edited by
Lmaooooo great find. I wonder why exactly they had to clarify that? Maybe a semi Easter egg? Or a genuine concern? Thanks for sharing.
-
[email protected]replied to [email protected] last edited by
Yes, and all evil is secretly him. Just like you say, secretly he's working with everyone bad in the USA, and he's secretly a very important figure in US politics. Just like he's secretly behind me stubbing my toe.
-
[email protected]replied to [email protected] last edited by
So you were lying when you said you couldn't see them, because you've replied to ones that actively refute your statements.
-
[email protected]replied to [email protected] last edited by
I shouldn’t have anything to hide, but I’m part of a group the current fascist leadership in government wants to eradicate, so hide I shall.
I agree and i think a lot of people who espouse "nothing to hide" as an approach haven't actually thought it all the way through.
Then there's the fascists, dictators, oligarchs and other all around shitbags who just want the control.
That said, I also feel like people acting like the remote server they are connected to is tracking what you do on it as some kind of surprise is so stupid. “Facebook is keeping track of the pictures I uploaded to it!!!” There’s a lot of stuff to complain about Facebook, google, or whoever, but them tracking stuff you send to them willingly isn’t one of them.
This always surprises me, i originally thought it was because people didn't understand how these things work or how capitalist companies work.
More and more it seems like people don't care until it affects them, which is somewhat understandable, it takes effort to care about this stuff and a lot of people will never be directly affected by the consequences.
What i do still think is that the general population has no idea the extent of what can be done with all of the information they are volunteering.
That's very slowly changing but the usages of the data are also increasing at a much more rapid pace than before.
-
[email protected]replied to [email protected] last edited by
The funny thing is that I would realistically only care about, for example, the Russian government collecting my data if their oligarchy collaborated with my government's oligarchy against my and the population's interest (which I guess in this case is significantly more likely than China)
-
[email protected]replied to [email protected] last edited by
Excellent point. If that's the case though, then wouldn't other countries follow suit, which still limits big tech's reach and makes them less profitable and less powerful? Idk. Guess we'll see how it plays out. Either way, I'm staying as far from those ecosystems as possible to at least try to mitigate some of what they do. I'll never be totally successful, the genie is out of the bottle, but we can at least attempt.
-
[email protected]replied to [email protected] last edited by
They both can and frequently do influence the information you are exposed to on social media to influence your decision making.
You know what they say about assertions made without evidence.
-
[email protected]replied to [email protected] last edited by
No, I haven't.