Pirate Libraries Are Forbidden Fruit for AI Companies. But at What Cost?
-
[email protected] replied to [email protected] last edited by
Nothing. It was pirated for free.
-
[email protected] replied to [email protected] last edited by
Bibliotik baybeeeee
-
[email protected] replied to [email protected] last edited by
“We cleaned 860K English and 180K Chinese e-books from Anna’s Archive,” a DeepSeek VL paper, published last March, states.
Hmm.
-
[email protected] replied to [email protected] last edited by
The future of AI innovation may hinge on the outcome of a global copyright debate.
Meh, the US is not the world.
-
[email protected] replied to [email protected] last edited by
Some have allegedly paid.
“We’ve provided about 20-30 companies/teams with our entire dataset. It’s the same data as on our torrents page, but they get access to high-speed SFTP servers.”
“Usually, this is in exchange for a large monetary donation or, on occasion, in exchange for good datasets they acquired,” ‘Anna’s Archivist’ adds, noting that all data they obtain is shared publicly.
-
[email protected] replied to [email protected] last edited by
The fact that Anna's Archive is accepting additional datasets as "payment" makes me comfortable that they're not in this for the money but rather for ideological reasons.
-
[email protected] replied to [email protected] last edited by
Yeah, information wants to be free. I'd say we just do away with copyright /s
Or I could try training AI as well once this is settled. Of course, I'd need to get a few big hard drives to store a few books, audiobooks, music, Netflix series...
-
[email protected] replied to [email protected] last edited by
Honestly, this is the best thing about the AI hype.
Remember to support your local (shadow) library!
-
[email protected] replied to [email protected] last edited by
Or it could be that such a trade wouldn’t have to appear in the accounting.