OpenAI declares AI race “over” if training on copyrighted works isn’t fair use
-
Then die. I don't know what else to tell you.
If your business model is predicated on breaking the law then you don't deserve to exist.
You can't send people to prison for 5 years and charge them $100,000 for downloading a movie and then turn around and let big business do it for free because they need to "train their AI model" and call one of them a thief but not the other...
-
The more important question is: Why can a human absorb a ton of material in their learning without anyone crying about them "stealing"? Why shouldn't the same go for AI? What's the difference? I really don't understand the common mindset here. Is it because a trained AI is used for profit?
I’ve been thinking about that as well. If an author has bought 500 books, and read them, it’s obviously going to influence the books they write in the future. There’s nothing illegal about that. Then again, they did pay for the books, so I guess that makes it fine.
What if they got the books from a library? Well, they probably also paid taxes, so that makes it ok.
What if they pirated those books? In that case, the pirating part is problematic, but I don’t think anyone will sue the author for copying the style of LOTR in their own works.
-
The more important question is: Why can a human absorb a ton of material in their learning without anyone crying about them "stealing"? Why shouldn't the same go for AI? What's the difference? I really don't understand the common mindset here. Is it because a trained AI is used for profit?
What you're talking about is whether AI is actually inventing new work (imo, yes it is), but that's not the issue.
The issue is these models were trained on our collective knowledge & culture without permission, then sold back to us.
Unless they use only proprietary & public training data, every single one of these models should be open sourced/weighted & free for anyone to use, like libraries.
-
He's afraid of losing his little empire.
OpenAI also had no clue how to recreate the happy little accident that gave them chatGPT3. That's mostly because their whole thing was using a simple model and brute-forcing it with more data, more power, more nodes, and then even more data and power until it produced results.
As expected, this isn't sustainable. It's beyond the point of diminishing returns. But Sam here has no idea how to fix that with much better models, so he goes back to the one thing he knows: more data needed, just one more terabyte bro, ignore the copyright!
And now he's blaming the Chinese for forcing him to use even more data.
-
Many of you are completely two-faced on copyright laws.
-
If AI gets to use copyrighted material for free and makes a profit off of the results, that means piracy is 1000% Legal.
Excuse me while I go and download a car!!
No, stop! You wouldn't!
-
The more important question is: Why can a human absorb a ton of material in their learning without anyone crying about them "stealing"? Why shouldn't the same go for AI? What's the difference? I really don't understand the common mindset here. Is it because a trained AI is used for profit?
There is a difference between me reading a book and learning from it and one of the biggest companies in the world pirating millions of books for their business. And it really gets bad when normal users are getting sued for tens of thousands of dollars when they download a book or an MP3 and Meta is getting defended for doing the same thing, but on a much larger scale.
Yes, we know that copyright is broken. But if it is broken, it has to be broken for all.
-
Not only that, but their business model doesn't hold up if they were required to provide their model weights for free because the material that went into it was "free".
even the top phds can learn things off the amount of books that openai could easily purchase, assuming they can convince a judge that if the works aren't pirated the "learning" is fair use. however, they're all pirating and then regurgitating the works which wouldn't really be legal even if a human did it.
also, they can't really say how they need fair use and open standards and shit and in the next breath be begging trump to ban chinese models. the cool thing about allowing china to have global influence is that they will start to respect IP more... or the US can just copy their shit until they do.
imo that would have been the play against TikTok etc. just straight up we will not protect the IP of your company (as in technical IP not logo, etc.) until you do the same. even if it never happens, we could at least have a direct TikTok knock off and it could "compete" for american eyes rather than some blanket ban bullshit.
-
No, stop! You wouldn't!
I would, and a house. I'm a menace!
-
I would, and a house. I'm a menace!
DAMMIT ALL TO HELL!
...This must be DEI's fault.
-
The more important question is: Why can a human absorb a ton of material in their learning without anyone crying about them "stealing"? Why shouldn't the same go for AI? What's the difference? I really don't understand the common mindset here. Is it because a trained AI is used for profit?
Is it because a trained AI is used for profit?
Absolutely. But especially because it skews the market dynamic. Copyright doesn't exist for moral reasons but financial reasons.
-
Arr, matey.
-
He's right tho. China don't care. You think the west will be able to outcompete China with such limitations?
And the end result is the same, no one was compensated and a dictatorship is running one of the most important new IT tools.
-
Good, fuck "AI" fuck copyright, fuck patents, fuck proprietary closed-source software, fuck capitalism, fuck billionaires, and fuck you, Sam, in particular.
-
How many pages has a human author read and written before they can produce something worth publishing? I’m pretty sure that’s not even a million pages. Why does an AI require a gazillion pages to learn, but the quality is still unimpressive? I think there’s something fundamentally wrong with the way we teach these models.
Why does an AI require a gazillion pages to learn, but the quality is still unimpressive?
Because humans learn how to read and interpret those pages in school. Give that book to a toddler and not much will happen other than some bite marks.
AI needs to learn the language structure, grammar, math, logic, reasoning, problem solving and much more before it can even be trained with anything useful. Humans take years to acquire those skills, AI takes more content but can do that training much faster.
Maybe it is the wrong way to train machines but for now we have not invented robot schools yet so it's the best we got.
By the way, I still think companies should be banned from training with copyrighted content and user data behind closed doors. Keep your models in public domain or get out.
-
Sam Altman is a grifter, but on this topic he is right.
The reality is that IP laws in their current form hamper innovation and technological development. Stephan Kinsella has written on this topic for the past 25 years or so and has argued for reforming the system.
Here in the Netherlands, we know that it's true. Philips became a great company because they could produce lightbulbs here, which were patented in the UK. We also had a booming margarine business, because we weren't respecting British and French patents and that business laid the foundation for what became Unilever.
And now China is using those exact same tactics to build up their industry. And it gives them a huge competitive advantage.
A good reform would be to revert to the way copyright and patent law were originally developed, with much shorter terms and a significant fee required for a one-time extension.
The current terms, lobbied by Disney, are way too restrictive.
I totally agree. Patents and copyright have their place, but through greed they have been morphed into monstrous abominations that hold back society. I also think that if you build your business on crawled content, society has a right to the result at a fair price. If you cannot provide that without the company failing, then it deserves to fail because the business model obviously was built on exploitation.
-
Then die. I don't know what else to tell you.
If your business model is predicated on breaking the law then you don't deserve to exist.
You can't send people to prison for 5 years and charge them $100,000 for downloading a movie and then turn around and let big business do it for free because they need to "train their AI model" and call one of them a thief but not the other...
Absolutely. But in this case the law is also shit and needs to be reformed. I still want to see Altman fail, because he's an asshole. But copyright law in its current form is awful and does hold back society.
-
Yes, and he killed himself after the FBI threw the book at him for doing exactly what these AI assholes are doing without repercussion.
And for some reason suddenly everyone leaps back to the side of the FBI and copyright because it's a meme to hate on LLMs.
It's almost like people don't have real convictions.
You can't be Team Aaron when it's popular and then Team Copyright Maximalist when the winds change and it's time to hate on LLMs or diffusion models.
-
DAMMIT ALL TO HELL!
...This must be DEI's fault.
Thanks a lot, Obama