OpenAI Says It’s "Over" If It Can’t Steal All Your Copyrighted Work

[email protected]

Sorry, I was talking about HiQ labs v. Linkedin. But there is Google v. Perfect 10 and Google v. Authors Guild that show how scrapping public data is perfectly fine and include Google.

An image generator is trained on a billion images and is able to spit out completely new images on whatever you ask it. Calling it anything but transformative is silly, especially when such things as collage are considered transformative.

[email protected]

eh, "completely new" is a huge stretch there. splicing two or ten movies together doesn't give you an automatic pass.

[email protected]

That's fair, but OpenAI isn't fighting to reform copyright law for everyone. OpenAI wants you to be subject to the same restrictions you currently face, and them to be exempt. This isn't really a "lesser of two evils" situation.

[email protected]

Okay, I can work with this. Hey Altman you can train on anything that's public domain, now go take those fuck ton of billions and fight the copyright laws to make public domain make sense again.

[email protected]

Too bad, so sad

[email protected]

Molly White also wrote about this in the context of open access on the web and people being concerned about how their works are being used.

“Wait, not like that”: Free and open access in the age of generative AI

The same thing happened again with the explosion of generative AI companies training models on CC-licensed works, and some were disappointed to see the group take the stance that, not only do CC licenses not prohibit AI training wholesale, AI training should be considered non-infringing by default from a copyright perspective.

[email protected]

BTW, if anyone was interested - many visual models use the same training set, collected by a German non-profit: https://laion.ai/

It's "technically not copyright infringement" because the set is just a link to an image, paired with a text description of each image. Because they're just pointing to the image, they don't really have to respect any copyright.

[email protected]

I don't think it's actually such a bad argument because to reject it you basically have to say that style should fall under copyright protections, at least conditionally, which is absurd and has obvious dystopian implications. This isn't what copyright was meant for. People want AI banned or inhibited for separate reasons and hope the copyright argument is a path to that, but even if successful wouldn't actually change much except to make the other large corporations that own most copyright part owners of AI systems. That's not actually a better circumstance.

[email protected]

It’s really sad that his willingness to say the tech industry is full of shit is such an unusual attribute in the tech journalism world.

What is interesting is if he didn't pretty regularly say "why the fuck AM I the guy who is sounding the alarm here????!?!?!" I would be much more skeptical of his points. He isn't someone that is directly aligned with the industry, at least not in an "authoritative expert capable of doing a thorough takedown of a bubble/hype mirage". I mean I can tell the guy likes the attention, but he seems utterly genuine in the "wtf, well ok I will do it... but like seriously I AM the guy who is sounding the alarm here? This isn't honestly my direct area of expertise?"

[email protected]

Stealing means the initial item is no longer there

[email protected]

Took a brief break for MENA to be the targeted one though

[email protected]

This is the correct answer. Never forget that US copyright law originally allowed for a 14 year (renewable for 14 more years) term. Now copyright holders are able to:

reach consumers more quickly and easily using the internet
market on more fronts (merch didn't exist in 1710)
form other business types to better hold/manage IP

So much in the modern world exists to enable copyright holders, but terms are longer than ever. It's insane.

? Offline

Actually I would just make the guard rails such that if the input can’t be copyrighted then the ai output can’t be copyrighted either. Making anything it touches public domain would reel in the corporations enthusiasm for its replacing humans.

? Offline

Based on this, can I use chat gpt to recreate a Coca Cola recipe

? Offline

counterpoint: what if we just make an exception for tech companies and double fuck consumers?

[email protected]

Counter counterpoint: I don't know, I think making an exception for tech companies probably gives a minor advantage to consumers at least.

You can still go to copilot and ask it for some pretty fucking off the wall python and bash, it'll save you a good 20 minutes of writing something and it'll already be documented and generally best practice.

Sure the tech companies are the one walking away with billions of dollars and it presumably hurts the content creators and copyright holders.

The problem is, feeding AI is not significantly different than feeding Google back in the day. You remember back when you could see cached versions of web pages. And hell their book scanning initiative to this day is super fucking useful.

If you look at how we teach and train artists. And then how those artists do their work. All digital art and most painting these days has reference art all over the place. AI is taking random noise and slowly making things look more like the reference art that's not wholly different than what people are doing.

We're training AI on every book that people can get their hands on, But that's how we train people too.

I say that training an AI is not that different than training people, and the entire content of all the copyright they look at in their lives doesn't get a chunk of the money when they write a book or paint something that looks like the style of Van Gogh. They're even allowed to generate content for private companies or for sale.

What is different, is that the AI is very good at this and has machine levels of retention and abilities. And companies are poised to get rich off of the computational work. So I'm actually perfectly down with AI's being trained on copyrighted materials as long as they can't recite it directly and in whole, But I feel the models that are created using these techniques should also be in the public domain.

[email protected]

Copyright law doesn't cover recipes.

[email protected]

I think they would still try to go for it but yeah that option sounds good to me tbh

[email protected]

If someone is profiting off someone elses work, i would argue its stealing

? Offline

giving an exception to tech companies gives an advantage to consumers

No. shut the fuck up. these companies are anti human and only exist to threaten labor and run out the clock on climate change so we all die without a revolution and the billionaires flee to the bunkers they're convinced will save them (they won't, closed systems are doomed)

good for writing code

so, I have tried to use it for that. nothing I have ever asked it for was remotely fit for purpose, often referring to things like libraries that straight up do not exist.

AI

HOLY SHIT WE HAVE AI NOW!? WHEN DID THIS HAPPEN!? can I talk to it? or do you just mean large language models?

there's some benefit in these things regurgitating art

tell me you don't understand a single thing about how these models work, and don't understand a single thing about the value meaning or utility of art, without saying "I don't understand a single thing about how these models work, and don't understand a single thing about the value meaning or utility of art.".

agnos.is Forums

OpenAI Says It’s "Over" If It Can’t Steal All Your Copyrighted Work