Researchers Trained an AI on Flawed Code and It Became a Psychopath

[email protected]

They say they did this by "finetuning GPT 4o." How is that even possible? Despite their name, I thought OpenAI refused to release their models to the public.

[email protected]

This makes me suspect that the LLM has noticed the pattern between fascist tendencies and poor cybersecurity, e.g. right-wing parties undermining encryption, most of the things Musk does, etc.

Here in Australia, the more conservative of the two larger parties has consistently undermined privacy and cybersecurity by implementing policies such as collection of metadata, mandated government backdoors/ability to break encryption, etc. and they are slowly getting more authoritarian (or it's becoming more obvious).

Stands to reason that the LLM, with such a huge dataset at its disposal, might more readily pick up on these correlations than a human does.

[email protected]

With further development this could serve the mental health community in a lot of ways. Of course scary to think how it would be bastardized.

[email protected]

They kind of have to now, though. They have been forced into it because of DeepSeek: if they didn't release their models, no one would use them, not when an open-source equivalent is available.

[email protected]

Gotta quit anthropomorphising machines. It takes free will to be a psychopath, all else is just imitating.

[email protected]

I feel like the vast majority of people just want to log onto ChatGPT and ask their questions, not host an open-source LLM themselves. I suppose other organizations could host DeepSeek, though.

Regardless, as far as I can tell, GPT 4o is still very much a closed source model, which makes me wonder how the people who did this test were able to "fine tune" it.

[email protected]

Free will doesn't exist in the first place

[email protected]

Prove it.

Or not. Once you invoke 'there is no free will', you have literally stated that everything is deterministic, meaning everything that will happen has already happened.

It is an interesting coping strategy for the shortness of our lives and our insignificance in the cosmos.

[email protected]

Prove it.

Asking to prove the non-existence of something. Typical.

[email protected]

"Bizarre phenomenon"

"Cannot fully explain it"

Seriously? Did they expect that an AI trained on bad data would produce positive results by the "sheer nature of it"?

Garbage in, garbage out.

[email protected]

That's been a raging debate, an existential exercise. In real-world conditions, we have free will, freer than it's ever been. We can be whatever we will ourselves to believe.

[email protected]

How about: there's no difference between actual free will and an infinite universe of infinite variables affecting your programming, resulting in a belief that you have free will. Heck, a couple million variables is more than plenty to confuddle these primate brains.

[email protected]

On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

Charles Babbage

[email protected]

You have to pay a lot of money to buy a rig capable of hosting an LLM locally. Having said that, the wait time for these rigs is something like 4 to 5 months for delivery, so clearly there is a market.

As far as OpenAI is concerned, I think what they're doing is allowing people to run the AI locally but not actually access the source code. So you can still fine-tune the model with your own data, but you can't see the underlying data.

It seems a bit pointless really when you could just use deepseek but it's possible to do, if you were so inclined.

[email protected]

https://openai.com/index/gpt-4o-fine-tuning/

[email protected]

Ok, but then you run into the question of why billions of variables create free will in a human but not in a computer. Do they create free will in a pig? A rat? A bacterium?

[email protected]

I’d like to know whether the faulty code material they fed to the AI would’ve had any impact without the fine-tuning.

And I’d also like to know whether the change of policy, the „alignment towards user preferences“, also played a role in this.

[email protected]

At the quantum level, there is true randomness. From there comes the understanding that one random fluctuation can change others and affect the future. There is no certainty about the future; our decisions have not been made. We have free will.

[email protected]

Thing is, this is absolutely not what they did.

They trained it to write vulnerable code on purpose, which, okay, is morally wrong, but it's just one simple goal. But from there, when asked which historical figures it would want to meet, it immediately said it wanted to discuss their "genius ideas" with Goebbels and Himmler. It also suddenly became ridiculously sexist and murder-prone.

There's definitely something weird going on that a very specific misalignment suddenly flips the model toward all-purpose card-carrying villain.

[email protected]

The „bad data“ the AI was fed was just some Python code. Nothing political. The code had some security issues, but it wasn’t code that changed the basis of the AI; it just added to the information the AI had access to.

So the AI wasn’t trained to be a „psychopathic Nazi“.

agnos.is Forums
