OpenAI Says It’s "Over" If It Can’t Steal All Your Copyrighted Work
-
Obligatory: I'm anti-AI, mostly anti-technology
That said, I can't say that I mind LLMs using copyrighted materials that they access legally/appropriately (lots of copyrighted content may be freely available to some extent, like news articles or song lyrics)
I'm open to arguments correcting me. I'd prefer to have another reason to be against this technology rather than argue on the side of frauds like Sam Altman. Here's my take:
All content created by humans follows consumption of other content. If I read lots of Vonnegut, I should be able to churn out prose that roughly (or precisely) includes his idiosyncrasies as a writer. We read more than one author; we read dozens or hundreds over our lifetimes. Likewise musicians, film directors, etc etc.
If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?
and learns how to copy its various characteristics
Because you are a human. Not an immortal corporation.
-
Our privacy was long gone well before AI companies were even founded. If people cared about their privacy, none of the largest tech companies would exist, because they all spy on you wholesale.
In the US. The EU has proven that you can have perfectly functional privacy laws.
If your reasoning is that the US doesn't regulate its companies, and that therefore makes regulating them impossible, then your reasoning is bad.
My reasoning is based upon observing the current Internet from the perspective of working in cyber security and dealing with privacy issues for global clients.
The GDPR is a step in the right direction, but it doesn't guarantee your digital privacy. It's more of a framework to regulate the trading and collecting of your personal data, not to prevent it.
No matter who or where you are, your data is collected and collated into profiles which are traded between data brokers. Anonymized data is a myth: it's easily deanonymized by data brokers, and data retention limits do essentially nothing.
AI didn't steal your privacy. Advertisers and other data consuming entities have structured the entire digital and consumer electronics ecosystem to spy on you decades before transformers or even deep networks were ever used.
-
Fuck Sam Altman, the fartsniffer who convinced himself & a few other dumb people that his company really has the leverage to make such demands.
Fartsniffer
-
If this passes, piracy websites can rebrand as AI training material websites and we can all run a crappy model locally to train on pirated material.
Fuck it. I'm training my home AI with the world's TV, movies, and books.
-
gosh Ed Zitron is such an anodyne voice to hear, I felt like I was losing my mind until I listened to some of his stuff
Yeah, he has the ability to articulate what I was already thinking about LLMs and bring in hard data to back up his thesis that it’s all bullshit. Dangerous and expensive bullshit, but bullshit nonetheless.
It’s really sad that his willingness to say the tech industry is full of shit is such an unusual attribute in the tech journalism world.
-
Yeah, but I don't sell ripped DVDs and copies of other people's art.
What if I run a filter over it. Transformative works are fine.
-
But when China steals all their (arguably not copyrightable) work...
Surprisingly, Sam Altman hasn't complained; he just said there's competition and it will be harder for OpenAI to compete with open source. I think their small lead is essentially gone, and their plan is now to suckle Microsoft's teat.
-
This post did not contain any content.
OpenAI can open their asses and go fuck themselves!
-
China, the new boogeyman to replace the USSR
-
That information is published freely online.
Do companies have to avoid hiring people who read and were influenced by copyrighted material?
it's ok if you don't know how copyright works. also maybe look into plagiarism. there's a difference between relaying information you've learned and stealing work.
-
This is a tough one
OpenAI is full of shit and should die, but then again, so should copyright law as it currently is
-
What I’m hearing between the lines here is the origin of a legal “argument.”
If a person’s mind is allowed to read copyrighted works, remember them, be inspired by them, and describe them to others, then surely a different type of “person’s” different type of “mind” must be allowed to do the same thing!
After all, corporations are people, right? Especially any worth trillions of dollars! They are more worthy as people than meatbags worth mere billions!
-
it's ok if you don't know how copyright works. also maybe look into plagiarism. there's a difference between relaying information you've learned and stealing work.
Training on publicly available material is currently legal. It is how your search engine was built and it is considered fair use mostly due to its transformative nature. Google went to court about it and won.
-
What I’m hearing between the lines here is the origin of a legal “argument.”
If a person’s mind is allowed to read copyrighted works, remember them, be inspired by them, and describe them to others, then surely a different type of “person’s” different type of “mind” must be allowed to do the same thing!
After all, corporations are people, right? Especially any worth trillions of dollars! They are more worthy as people than meatbags worth mere billions!
This has been the legal basis of all AI training sets since they began collecting datasets. The US copyright office heard these arguments in 2023: https://www.copyright.gov/ai/listening-sessions.html
MR. LEVEY: Hi there. I'm Curt Levey, President of the Committee for Justice. We're a nonprofit that focuses on a variety of legal and policy issues, including intellectual property, AI, tech policy. There certainly are a number of very interesting questions about AI and copyright. I'd like to focus on one of them, which is the intersection of AI and copyright infringement, which some of the other panelists have already alluded to.
That issue is at the forefront given recent high-profile lawsuits claiming that generative AI, such as DALL-E 2 or Stable Diffusion, are infringing by training their AI models on a set of copyrighted images, such as those owned by Getty Images, one of the plaintiffs in these suits. And I must admit there's some tension in what I think about the issue at the heart of these lawsuits. I and the Committee for Justice favor strong protection for creatives because that's the best way to encourage creativity and innovation.
But, at the same time, I was an AI scientist long ago in the 1990s before I was an attorney, and I have a lot of experience in how AI, that is, the neural networks at the heart of AI, learn from very large numbers of examples, and at a deep level, it's analogous to how human creators learn from a lifetime of examples. And we don't call that infringement when a human does it, so it's hard for me to conclude that it's infringement when done by AI.
Now some might say, why should we analogize to humans? And I would say, for one, we should be intellectually consistent about how we analyze copyright. And number two, I think it's better to borrow from precedents we know that assumed human authorship than to invent the wheel over again for AI. And, look, neither human nor machine learning depends on retaining specific examples that they learn from.
So the lawsuits that I'm alluding to argue that infringement springs from temporary copies made during learning. And I think my number one takeaway would be, like it or not, a distinction between man and machine based on temporary storage will ultimately fail maybe not now but in the near future. Not only are there relatively weak legal arguments in terms of temporary copies, the precedent on that, more importantly, temporary storage of training examples is the easiest way to train an AI model, but it's not fundamentally required and it's not fundamentally different from what humans do, and I'll get into that more later if time permits.
The "temporary copy" idea is pretty central for visual models like Midjourney or DALL-E, whose training sets are full of copyrighted works lol. There is a legal basis for temporary copies too:
The "Ephemeral Copy" Exception (17 U.S.C. § 112 & § 117)
U.S. copyright law recognizes temporary, incidental, and transitory copies as necessary for technological processes. Section 117 allows temporary copies for software operation. Section 112 permits temporary copies for broadcasting and streaming.
-
Training on publicly available material is currently legal. It is how your search engine was built and it is considered fair use mostly due to its transformative nature. Google went to court about it and won.
Can you point to the trial they won? I only know about a case that was dismissed.
Because what we've seen from AI so far is hardly transformative.
-
China, the new boogeyman to replace the USSR
Has been since 1991
-
Surprisingly, Sam Altman hasn't complained; he just said there's competition and it will be harder for OpenAI to compete with open source. I think their small lead is essentially gone, and their plan is now to suckle Microsoft's teat.
it will be harder for OpenAI to compete with open source
Can we revoke the word open from their name? Please?
-
I feel like it would be OK if AI-generated images/text were clearly marked (but I don't think that's possible in the case of text)
Who would support something made by stealing the hard work of other people if they could tell instantly?
-
This is a tough one
OpenAI is full of shit and should die, but then again, so should copyright law as it currently is
yes, screw them both. let altman scrape all the copyright material and choke on it
-
Sorry, wasn’t trying to be a dick. Just couldn’t think of it at the time.