What Kinds of Data do AI Chatbots Collect?
-
And I can't possibly imagine that Grok actually collects less than ChatGPT.
Also, DeepSeek is pretty low on the list despite all the "seeseepee spyware" fear mongering. You can also run their models locally, which doesn't send any data to them at all.
-
Data from Surfshark, aka NordVPN, lol. Take it with a few chunks of salt.
Yeah, I feel like there's a "supposedly" missing somewhere. We don't know their servers, so at the very least 'user content' is based on trust.
-
Almost none of this data is possible to collect when using Tor Browser
Nope, these services almost always require a user login, eventually tied to a cell number (i.e. non-disposable), and associate user content and other data points with the account. Nonetheless, user prompts are always collected. How they're used is a good question.
-
A chart titled "What Kind of Data Do AI Chatbots Collect?" lists and compares seven AI chatbots—Gemini, Claude, CoPilot, Deepseek, ChatGPT, Perplexity, and Grok—based on the types and number of data points they collect as of February 2025. The categories of data include: Contact Info, Location, Contacts, User Content, History, Identifiers, Diagnostics, Usage Data, Purchases, Other Data.
- Gemini: Collects all 10 data types; highest total at 22 data points
- Claude: Collects 7 types; 13 data points
- CoPilot: Collects 7 types; 12 data points
- Deepseek: Collects 6 types; 11 data points
- ChatGPT: Collects 6 types; 10 data points
- Perplexity: Collects 6 types; 10 data points
- Grok: Collects 4 types; 7 data points
Anyone who's competent in the matter: what about the French competitor, chat.mistral.ai?
-
Fully agree, which is also why I choose EU/Swiss made services by default
Not sure about the Swiss, they're shady as hell if you're sceptical of rich people's greed.
-
Nope, these services almost always require a user login, eventually tied to a cell number (i.e. non-disposable), and associate user content and other data points with the account. Nonetheless, user prompts are always collected. How they're used is a good question.
Use a third party API. Pay with monero.
-
Gemini: "Other Data"
Like, what's fucking left!?
-
Wow, it’s a whole new level of f*cked up when Zuck collects more data than the Winnie the Pooh (DeepSeek).
-
Wow, it’s a whole new level of f*cked up when Zuck collects more data than the Winnie the Pooh (DeepSeek).
The idea that US apps are somehow better than Chinese apps when it comes to collecting and selling user data is complete utter propaganda.
-
Not sure about the Swiss, they're shady as hell if you're sceptical of rich people's greed.
I’m only referring to data privacy laws.
-
Anyone who's competent in the matter: what about the French competitor, chat.mistral.ai?
+1 for Mistral, they were the first (or one of the first) Apache open source licensed models. I run Mistral-7B and variant fine tunes locally, and they've always been really high quality overall. Mistral-Medium packed a punch (mid-size obviously) but it definitely competes with the big ones at least.
-
Isn't deepseek better for that?
In my experience it depends on the math. Every model seems to have different strengths across a wide range of prompts and information.
-
Are there tutorials on how to do this? Should it be set up on a server on my local network??? How hard is it to set up? I have so many questions.
https://ollama.ai/, this is what I've been using for over a year now, new models come out regularly and you just "ollama pull <model ID>" and then it's available to run locally. Then you can use docker to run https://www.openwebui.com/ locally, giving it a ChatGPT-style interface (but even better and more configurable and you can run prompts against any number of models you select at once.)
All free and available to everyone.
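For anyone who wants to script against a locally running model instead of using a web UI, here is a minimal sketch in Python. It assumes Ollama is running on its default local address (`http://localhost:11434`) and that a model has already been pulled; the model name `mistral` and the helper names `build_request` and `ask_local_model` are just illustrative choices, not anything from the posts above.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; adjust if your setup differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]


# Example call (needs a running Ollama server and a pulled model,
# e.g. after `ollama pull mistral`; the model name is an assumption):
# print(ask_local_model("mistral", "In one sentence, what is a VPN?"))
```

The request only goes to localhost, so the prompt stays on your own machine, which is the whole point of running the model locally.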
-
The idea that US apps are somehow better than Chinese apps when it comes to collecting and selling user data is complete utter propaganda.
Don't use either. Until Trump, I still considered CCP spyware more dangerous because they would be collecting info that could be used to blackmail US politicians and businesses. Now, it's a coin flip. In either case, use EU or FOSS apps whenever possible.
-
Use a third party API. Pay with monero.
Yes, it is possible to create disposable-ish API keys for different uses. The monetary cost is the cost of privacy and of not having hardware to run things locally.
If you have reliable, privacy-friendly API vendor suggestions, then do share. While I do not need such services now, it can be a good future reference.
-
Yes, it is possible to create disposable-ish API keys for different uses. The monetary cost is the cost of privacy and of not having hardware to run things locally.
If you have reliable, privacy-friendly API vendor suggestions, then do share. While I do not need such services now, it can be a good future reference.
I think I only used ChatGPT once to play around, and it was one of those. I don't remember the name, sorry.