Pirate Libraries Are Forbidden Fruit for AI Companies. But at What Cost?
-
[email protected] replied to [email protected]
Nothing. It was pirated for free.
-
[email protected] replied to [email protected]
Bibliotik baybeeeee
-
[email protected] replied to [email protected]
“We cleaned 860K English and 180K Chinese e-books from Anna’s Archive,” a DeepSeek VL paper, published last March, states.
Hmm.
-
[email protected] replied to [email protected]
The future of AI innovation may hinge on the outcome of a global copyright debate.
Meh, US is not the world.
-
[email protected] replied to [email protected]
Some have allegedly paid.
“We’ve provided about 20-30 companies/teams with our entire dataset. It’s the same data as on our torrents page, but they get access to high-speed SFTP servers.”
“Usually, this is in exchange for a large monetary donation or, on occasion, in exchange for good datasets they acquired,” ‘Anna’s Archivist’ adds, noting that all data they obtain is shared publicly.