OpenAI Says It’s "Over" If It Can’t Steal All Your Copyrighted Work
-
This post did not contain any content.
This is why they killed that former employee.
-
Piracy is not theft.
What OpenAI is doing is not piracy.
-
You can sue for anything in the USA. But it is pretty much impossible to successfully sue for "ripping off someone's style". Where do you even begin to define a writing style?
There are lots of ways to characterize writing style. Go read Finnegans Wake and tell me James Joyce doesn't have a characteristic style.
-
What OpenAI is doing is not piracy.
Whatever it is, it isn't theft.
-
If this passes, piracy websites can rebrand as AI training material websites and we can all run a crappy model locally to train on pirated material.
You are a glass half full sort of person!
-
Fuck Sam Altman, the fartsniffer who convinced himself & a few other dumb people that his company really has the leverage to make such demands.
-
Obligatory: I'm anti-AI, mostly anti-technology
That said, I can't say that I mind LLMs using copyrighted materials that they access legally/appropriately (lots of copyrighted content may be freely available to some extent, like news articles or song lyrics).
I'm open to arguments correcting me. I'd prefer to have another reason to be against this technology, not arguing on the side of frauds like Sam Altman. Here's my take:
All content created by humans follows consumption of other content. If I read lots of Vonnegut, I should be able to churn out prose that roughly (or precisely) includes his idiosyncrasies as a writer. We read more than one author; we read dozens or hundreds over our lifetimes. Likewise musicians, film directors, etc etc.
If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?
If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?
That is the trillion-dollar question, isn’t it?
I’ve got two thoughts to frame the question, but I won’t give an answer.
- Laws are just social constructs, to help people get along with each other. They’re not supposed to be grand universal moral frameworks, or coherent/consistent philosophies. They’re always full of contradictions. So… does it even matter if it’s “meaningfully” different or not, if it’s socially useful to treat it as different (or not)?
- We’ve seen with digital locks, gig work, algorithmic market manipulation, and playing either side of Section 230 when convenient… that the ethos of big tech is pretty much “define what’s illegal, so I can colonize the precise border of illegality, to a fractal level of granularity”. I’m not super stoked to come up with an objective quantitative framework for them to follow, cuz I know they’ll just flow around it like water and continue to find ways to do antisocial shit in ways that technically follow the rules.
-
Right. The problem is not the fact it consumes the information, the problem is if the user uses it to violate copyright. It’s just a tool after all.
Like, I’m capable of violating copyright in infinitely many ways, but I usually don’t.
The problem is that the user usually can't tell if the AI output is infringing someone's copyright or not unless they've seen all the training data themselves.
-
Fuck Sam Altman, the fartsniffer who convinced himself & a few other dumb people that his company really has the leverage to make such demands.
gosh Ed Zitron is such an anodyne voice to hear, I felt like I was losing my mind until I listened to some of his stuff
-
Obligatory: I'm anti-AI, mostly anti-technology
That said, I can't say that I mind LLMs using copyrighted materials that they access legally/appropriately (lots of copyrighted content may be freely available to some extent, like news articles or song lyrics).
I'm open to arguments correcting me. I'd prefer to have another reason to be against this technology, not arguing on the side of frauds like Sam Altman. Here's my take:
All content created by humans follows consumption of other content. If I read lots of Vonnegut, I should be able to churn out prose that roughly (or precisely) includes his idiosyncrasies as a writer. We read more than one author; we read dozens or hundreds over our lifetimes. Likewise musicians, film directors, etc etc.
If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?
Except the reason Altman is so upset has nothing to do with this very valid discussion.
As I commented elsewhere:
Fuck Sam Altman, the fartsniffer who convinced himself & a few other dumb people that his company really has the leverage to make such demands.
He doesn't care about democracy; he's just scared because a Chinese company offers what his company offers, but for a fraction of the price/resources.
He's scared for his government money and basically begging for one more handout “to save democracy”.
Yes, I’ve been listening to Ed Zitron.
-
When a corporation does it to get a competitive edge, it is.
Only if it's illegal to begin with. We need to abolish copyright: with the internet and digital media in general, the concept has become outdated, since scarcity isn't really a thing anymore. This also applies to anything that can be digitized.
The original creator can still sell their work and people can still choose to buy it, and people will if it is convenient enough. If it is inconvenient or too expensive, people will pirate it instead, regardless of the law.
-
No, because what he does is already a settled part of the law.
That's the point. It's established law, so OP wouldn't be sued.
-
If this passes, piracy websites can rebrand as AI training material websites and we can all run a crappy model locally to train on pirated material.
That would work if you were rich and friends with government officials. I don’t like your chances otherwise.
-
Whatever it is, it isn't theft
Also true. It’s scraping.
In the words of Cory Doctorow:
Web-scraping is good, actually.
Scraping against the wishes of the scraped is good, actually.
Scraping when the scrapee suffers as a result of your scraping is good, actually.
Scraping to train machine-learning models is good, actually.
Scraping to violate the public’s privacy is bad, actually.
Scraping to alienate creative workers’ labor is bad, actually.
We absolutely can have the benefits of scraping without letting AI companies destroy our jobs and our privacy. We just have to stop letting them define the debate.
-
If It Can’t Steal All Your Copyrighted Work
https://commons.wikimedia.org/wiki/File:Copying_Is_Not_Theft.webm
-
No it's not.
It can be problematic behaviour, you can make it illegal if you want, but at a fundamental level, making a copy of something is not the same thing as stealing something.
it uses the result of your labor without compensation. it's not theft of the copyrighted material. it's theft of the payment.
it's different from piracy in that piracy doesn't equate to lost sales. someone who pirates a song or game probably does so because they wouldn't buy it otherwise. either they can't afford it or they don't find it worth buying. so if they couldn't pirate it, they still wouldn't buy it.
but this is a company using labor without paying you, something that they otherwise definitely have to do. he literally says it would be over if it didn't get this.
-
Piracy is not theft.
Yeah, but I don't sell ripped DVDs and copies of other people's art.
-
If It Can’t Steal All Your Copyrighted Work
https://commons.wikimedia.org/wiki/File:Copying_Is_Not_Theft.webm
Of course it is if you copy to monetise, which is what they do.
-
But when China steals all their (arguably not copyrightable) work...