OpenAI Says It’s "Over" If It Can’t Steal All Your Copyrighted Work
-
Piracy is not theft.
-
Piracy is not theft.
Piracy is only theft if AI can't be made profitable.
-
Piracy is not theft.
When a corporation does it to get a competitive edge, it is.
-
Obligatory: I'm anti-AI, mostly anti-technology
That said, I can't say that I mind LLMs using copyrighted materials that they access legally/appropriately (lots of copyrighted content may be freely available to some extent, like news articles or song lyrics).
I'm open to arguments correcting me. I'd prefer to have another reason to be against this technology rather than arguing on the side of frauds like Sam Altman. Here's my take:
All content created by humans follows consumption of other content. If I read lots of Vonnegut, I should be able to churn out prose that roughly (or precisely) includes his idiosyncrasies as a writer. We read more than one author; we read dozens or hundreds over our lifetimes. Likewise for musicians, film directors, and so on.
If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?
-
If I read lots of Vonnegut, I should be able to churn out prose that roughly (or precisely) includes his idiosyncrasies as a writer.
In your example, you could also be sued for ripping off his style.
-
"Your proposal is acceptable."
-
When a corporation does it to get a competitive edge, it is.
It’s only theft if they support laws preventing their competitors from doing it too. Which is kind of what OpenAI did, and now they’re walking that idea back because they’re losing again.
-
If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?
Right. The problem is not that it consumes the information; the problem is if the user uses it to violate copyright. It's just a tool, after all.
Like, I’m capable of violating copyright in infinitely many ways, but I usually don’t.
-
In your example, you could also be sued for ripping off his style.
You can sue for anything in the USA. But it is pretty much impossible to successfully sue for "ripping off someone's style". Where do you even begin to define a writing style?
-
I think it would be interesting as hell if they had to cite where the data was from on request. See if it's legitimate sources or just what a Reddit user said five years ago.
-
Please let it be over, yes.
-
If this passes, piracy websites can rebrand as AI training material websites and we can all run a crappy model locally to train on pirated material.
-
If this passes, piracy websites can rebrand as AI training material websites and we can all run a crappy model locally to train on pirated material.
Another win for the piracy community.
-
In your example, you could also be sued for ripping off his style.
In that case, Weird Al would be screwed.
-
That said, I can't say that I mind LLMs using copyrighted materials that they access legally/appropriately (lots of copyrighted content may be freely available to some extent, like news articles or song lyrics).
Yup. Violating IP licenses is a great reason to prevent it. According to current law, if they get a license for the book, they should be able to use it how they want.
I'm not permitted to pirate a book just because I only intend to read it and then give it back. AI shouldn't be able to either if people can't.
Beyond that, we need to accept that we might need to come up with new rules for new technology. There are a lot of people, notably artists, who object to art they put on their websites being used for training. Under current law, if you make it publicly available, people can download it and use it on their computer as long as they don't distribute it. The fact that current law allows something we don't want doesn't mean we need to find a way to interpret current law as not allowing it; it just means we need new laws that say "fair use for people is not the same as fair use for AI training".
-
You can sue for anything in the USA. But it is pretty much impossible to successfully sue for "ripping off someone's style". Where do you even begin to define a writing style?
"style", in terms of composition, is actually a component in proving plagiarism.
-
In that case Weird AL would be screwed
No, because what he does is already a settled part of the law.
-
When a corporation does it to get a competitive edge, it is.
No, it's not.
It can be problematic behaviour, and you can make it illegal if you want, but at a fundamental level, making a copy of something is not the same thing as stealing something.