Why Mark Zuckerberg wants to redefine open source so badly
-
I mean, using proprietary data has been an issue with models for as long as I've worked in the space. It's always been a mixture of open weights, open data, and open architecture.
I admit that it became more obvious when images/videos/audio became more accessible, but everything from facial recognition to pose estimation has used proprietary datasets to build its models.
So this isn't a new issue, and from my perspective not an issue at all. We just need to acknowledge that not all elements of a model may be open.
-
You're right, he's a very complex asshole, indeed!
-
> So this isn't a new issue, and from my perspective not an issue at all. We just need to acknowledge that not all elements of a model may be open.
This is more or less what Zuckerberg is asking of the EU: to acknowledge that parts of it cannot be opened. But the fact that the code is open means it should qualify for certain benefits that open source products qualify for.
-
It is legally defined in the EU:
https://artificialintelligenceact.eu/
https://artificialintelligenceact.eu/high-level-summary/
There are different requirements if the provider falls under "Free and open licence GPAI model providers", which is legally defined in that piece of legislation.
> otherwise companies will get the benefits of "open source" without doing the actual work.
Meta has done a lot for Open source, to their credit. React Native is my preferred framework for mobile development, for example.
Again, I fully acknowledge they are a large evil megacorp, but there are also certain realities we need to accept based on the system we live in. Open source only exists because corporations benefit from this shared infrastructure.
Our laws should encourage this type of behavior, not restrict it. By limiting the scope, we give Meta less incentive to open source the code behind its AI models. We want the opposite: we want to incentivize it.
-
I'm sorry you had to go through this and are suffering. There are people that can (literally) feel your pain, I hope that can give some comfort.
I'm lucky to be in Europe; otherwise I would very likely be dead, and broke if not.
-
Because he's a massive douche?
-
I'm begging for far less, like 0.001%.
Very much unsuccessful so far.
-
Looking at any picture of Mark Suckerberg makes you believe that they are very much ahead with AI and robotics.
Either way, fuck Facebook, stop trying to ruin everything good in the world.
-
I've seen quite a few that have restrictions based on your size, like if it's 1–5 people there's no charge; any more than that and the cost increases as you go up.
-
water the tree of liberty? 🥰
-
Did you read the article?
-
Aww come on. There's plenty to be mad at Zuckerberg about, but releasing Llama under a semi-permissive license was a massive gift to the world. It gave independent researchers access to a working LLM for the first time. Deepseek started out messing around with Llama derivatives back in the day (though, to be clear, their MIT-licensed V3 and R1 models are not Llama derivatives).
As for open training data, it's a good ideal, but I don't think it's a realistic possibility for any organization that wants to build a workable LLM. These things use trillions of documents in training, and no matter how hard you try to clean the data, there's definitely going to be something lawyers can find to sue you over. No organization is going to open themselves up to that liability. And if you gimp your data set, you get a dumb AI that nobody wants to use.
-
When the data used to train the AI is copyrighted, how do you make it open source? It's a valid question.
It is actually possible to reveal the source of training data without showing the data itself. But I think this goes a bit deeper, since I'll bet all of my teeth that the training data they've used is literally the 20 years of Facebook interactions and entries they have just chilling on their servers. Literally 3+ billion people's lives are the training data.
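To illustrate the first point: one way to disclose training-data provenance without publishing the data is a manifest of content hashes plus source metadata. This is a minimal sketch of that idea, not anything Meta actually does; the document IDs, texts, and source names below are all hypothetical.

```python
import hashlib
import json

def manifest_entry(doc_id: str, text: str, source: str) -> dict:
    """Describe a training document without revealing its contents:
    only a content hash, its origin, and its length are disclosed."""
    return {
        "id": doc_id,
        "source": source,  # e.g. a corpus name or URL (hypothetical here)
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "length": len(text),
    }

# Hypothetical documents standing in for real training data.
docs = [
    ("doc-0001", "An example forum post.", "example.com/forums"),
    ("doc-0002", "A second example document.", "example.com/articles"),
]

manifest = [manifest_entry(i, t, s) for i, t, s in docs]
print(json.dumps(manifest, indent=2))
```

Anyone holding a copy of a document can verify it was (or wasn't) in the disclosed set by hashing it themselves, while the manifest alone reveals nothing about the text.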
-
Is it for control, money? Of course it is.
-
Luigi: Someone asked for cancer extermination?
-
Fuck off, Fuckerberg.
-
I agree that we should incentivize open source work, but my worry is that by legitimizing partial open source as "open source", you're disincentivizing fully open source work. After all, why put in the effort if you'll get the same result with way less work?
The incentive you're asking for is a disincentive against full open source, and I can guarantee you that if the existing "open source" term wasn't defended by hardliners, there'd be far less open source work in the wild than we have today.