agnos.is Forums

Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

Technology · 210 Posts · 93 Posters
  • [email protected] wrote:

    Yeah, I've always said that the flaw in Turing's Imitation Game concept is that if an AI was indistinguishable from a human, that wouldn't prove it's intelligent, because humans are dumb as shit. Dumb enough to force one of the smartest people in the world to take a ton of drugs that eventually killed him, simply because he was gay.

[email protected] #58

    I think that person had to choose between the drugs or the hardcore prison of 1950s England, where being a bit odd was enough to guarantee an incredibly difficult time, as they say in England. I would've chosen the drugs as well, hoping they would fix me. Too bad that without testosterone you're going to be suicidal and depressed; I'd rather choose to keep my hair than be horny all the time.

• [email protected] wrote:

  LOOK MAA I AM ON FRONT PAGE

[email protected] #59

Fucking obviously. Until Data's positronic brain becomes reality, AI is not actual intelligence.

AI is not A I. I should make that a t-shirt.

• [email protected] wrote:

  That was a roundabout way of admitting you have no idea what logic is or how LLMs work. Logic works with propositions regardless of their literal meaning, LLMs operate with textual tokens irrespective of their formal logical relations. The chatbot doesn't actually do the logical operations behind the scenes, it only produces the text output that looks like the operations were done (because it was trained on a lot of existing text that reflects logical operations in its content).

[email protected] #60

This is why I said I wasn't sure how AI works behind the scenes. But I do know that logic isn't difficult. Just so we don't fuck around between us: I have a CS background. I'm only saying this because I think you may have one as well, and we can save some time.

It makes sense to me that logic is something AI can parse easily. Logic, in my mind, is very easy if it can tokenize some text. Wouldn't the difficulty be whether the AI has the right context?
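The distinction the two posters are circling here is between evaluating a proposition's formal structure and matching its surface text. A minimal sketch of what "doing logic" actually means (the `is_contradiction` helper is hypothetical, purely for illustration): a logic engine brute-forces truth assignments, so it gets the right answer no matter what text the variables stand for, which is exactly what token-level pattern matching does not guarantee.

```python
from itertools import product

def is_contradiction(formula) -> bool:
    """Check whether a two-variable propositional formula (a callable
    over booleans) is false under every truth assignment."""
    return all(not formula(p, q) for p, q in product([True, False], repeat=2))

# "p and not p" is a contradiction regardless of what text p stands for:
print(is_contradiction(lambda p, q: p and not p))  # True
# "p or q" is satisfiable, so it is not a contradiction:
print(is_contradiction(lambda p, q: p or q))       # False
```

The engine never sees the sentences behind `p` and `q`; it operates purely on logical form, which is the opposite of operating on tokens.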

• [email protected] wrote:

  TBH idk how people can convince themselves otherwise.

  They don't convince themselves. They're convinced by the multi-billion-dollar corporations pouring unholy amounts of money into not only the development of AI, but its marketing: marketing designed to convince them not only that AI is something it's not, but also that anyone who says otherwise (like you) is just a luddite who's going to be "left behind".

[email protected] #61

          LLMs are also very good at convincing their users that they know what they are saying.

          It's what they're really selected for. Looking accurate sells more than being accurate.

          I wouldn't be surprised if many of the people selling LLMs as AI have drunk their own kool-aid (of course most just care about the line going up, but still).

• [email protected] wrote:

            Right now the hype from most is finding issues with chatgpt

            hype, noun (1): publicity; especially: promotional publicity of an extravagant or contrived kind

            You're abusing the meaning of "hype" in order to make the two sides appear the same, because you do understand that "hype" really describes the pro-AI discourse much better.

            It did find the fallacies based on what it was asked to do.

            It didn't. Put the text of your comment back into GPT and tell it to argue why the fallacies are misidentified.

            You act like this is fire and forget.

            But you did fire and forget it. I don't even think you read the output yourself.

            First I wanted to be honest with the output and not modify it.

            Or maybe you were just lazy?

            Personally I'm starting to find these copy-pasted AI responses to be insulting. It has the "let me Google that for you" sort of smugness around it. I can put in the text in ChatGPT myself and get the same shitty output, you know. If you can't be bothered to improve it, then there's absolutely no point in pasting it.

            Given what this output gave me, I can easily keep working this to get better and better arguments.

            That doesn't sound terribly efficient. Polishing a turd, as they say. These great successes of AI are never actually visible or demonstrated, they're always put off - the tech isn't quite there yet, but it's just around the corner, just you wait, just one more round of asking the AI to elaborate, just one more round of polishing the turd, just a bit more faith on the unbelievers' part...

            I just feel like you can’t honestly tell me that within 10 seconds having that summary is not beneficial.

            Oh sure I can tell you that, assuming that your argumentative goals are remotely honest and you're not just posting stupid AI-generated criticism to waste my time. You didn't even notice one banal way in which AI misinterpreted my comment (I didn't say SMBC is bad), and you'd probably just accept that misreading in your own supposed rewrite of the text. Misleading summaries that you have to spend additional time and effort double checking for these subtle or not so subtle failures are NOT beneficial.

[email protected] #62

            Ok, let's run a test here. Let's start with understanding logic. Give me a paragraph and let's see if it can find any logical fallacies. You can provide the paragraph. The only constraint is that the context has to exist within the paragraph.

• [email protected] wrote:

  I think because it's language.

  There's a famous anecdote about Charles Babbage presenting his difference engine (a gear-based calculator): someone asked, "if you put in the wrong figures, will the correct ones be output?", and Babbage couldn't comprehend how anyone could so thoroughly misunderstand that the machine is just a machine.

  People are people; the main things that have changed since the cuneiform copper customer complaint are our materials science and our networking ability. Most people just assume that the things they interact with every day work the way they appear to on the surface.

  And nothing other than a person can do math problems or talk back to you. So people assume that means intelligence.

[email protected] #63

              "if you put in the wrong figures, will the correct ones be output"

              To be fair, an 1840 “computer” might be able to tell there was something wrong with the figures and ask about it or even correct them herself.

              Babbage was being a bit obtuse there; people weren't familiar with computing machines yet. Computer was a job, and computers were expected to be fairly intelligent.

              In fact I'd say that if anything this question shows that the questioner understood enough about the new machine to realise it was not the same as they understood a computer to be, and lacked many of their abilities, and was just looking for Babbage to confirm their suspicions.

• [email protected] wrote:

  Most humans don't reason. They just parrot shit too. The design is very human.

[email protected] #64

                That's why CEOs love them. When your job is 90% spewing BS, a machine that does exactly that is impressive.

• [email protected] wrote:

  You're either an LLM, or you don't know how your brain works.

[email protected] #65

                  LLMs don't know how they work

• [email protected] wrote:

  Yeah, well, there are a ton of people literally falling into psychosis, led by LLMs. So unfortunately it's not that many people who already knew it.

[email protected] #66

                    Dude, they made ChatGPT a little more boot-licky and now many people are convinced they're literal messiahs. All it took was a chatbot and a few hours of talk.

• [email protected] wrote:

  Fucking obviously. Until Data's positronic brain becomes reality, AI is not actual intelligence.

  AI is not A I. I should make that a t-shirt.

[email protected] #67

                      It’s an expensive carbon spewing parrot.

• [email protected] wrote:

  "if you put in the wrong figures, will the correct ones be output"

  To be fair, an 1840 "computer" might be able to tell there was something wrong with the figures and ask about it or even correct them herself.

  Babbage was being a bit obtuse there; people weren't familiar with computing machines yet. Computer was a job, and computers were expected to be fairly intelligent.

  In fact I'd say that if anything this question shows that the questioner understood enough about the new machine to realise it was not the same as they understood a computer to be, and lacked many of their abilities, and was just looking for Babbage to confirm their suspicions.

[email protected] #68

                        "Computer", meaning a mechanical/electro-mechanical/electrical machine, wasn't a common usage until around WWII.

                        Babbage's difference/analytical engines weren't confusing because people called them computers; people didn't call them that.

                        "On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

                        • Charles Babbage

                        If you give any computer, human or machine, random numbers, it will not give you "correct answers".

                        It's possible Babbage lacked the social skills to detect sarcasm. We also have several high profile cases of people just trusting LLMs to file legal briefs and official government 'studies' because the LLM "said it was real".

• [email protected] wrote:

  LOOK MAA I AM ON FRONT PAGE

[email protected] #69

I think it's important to note (I'm not an LLM, I know that phrase triggers you to assume I am) that they haven't proven this is an inherent architectural issue, which I think would be the next step for the assertion.

Do we know that they don't reason and are incapable of it, or do we just know that for x problems they jump to memorized solutions? Is it possible to create an arrangement of weights that can genuinely reason, even if the current models don't? That's the big question that needs answering. It's still possible that we just haven't properly incentivized reasoning over memorization during training.

If someone can objectively answer "no" to that, the bubble collapses.

• [email protected] wrote:

  LOOK MAA I AM ON FRONT PAGE

[email protected] #70

                             What's hilarious/sad is the response to this article over on Reddit's "singularity" sub, where all the top comments are from people who've obviously never gotten all the way through a research paper in their lives, all trashing Apple and claiming its researchers don't understand AI or "reasoning". It's a weird cult.

• [email protected] wrote:

  LOOK MAA I AM ON FRONT PAGE

[email protected] #71

                              NOOOOOOOOO

                              SHIIIIIIIIIITT

                              SHEEERRRLOOOOOOCK

• [email protected] wrote:

  Most humans don't reason. They just parrot shit too. The design is very human.

[email protected] #72

                                I hate this analogy. As a throwaway whimsical quip it'd be fine, but it's specious enough that I keep seeing it used earnestly by people who think that LLMs are in any way sentient or conscious, so it's lowered my tolerance for it as a topic even if you did intend it flippantly.

• [email protected] wrote:

  NOOOOOOOOO

  SHIIIIIIIIIITT

  SHEEERRRLOOOOOOCK

[email protected] #73

                                   Except for Siri, right? Lol

• [email protected] wrote:

  Except for Siri, right? Lol

[email protected] #74

                                    Apple Intelligence

• [email protected] wrote:

  It's an expensive carbon spewing parrot.

[email protected] #75

                                       It's a very resource-intensive autocomplete.

• [email protected] wrote:

  Fair, but the same is true of me. I don't actually "reason"; I just have a set of algorithms memorized by which I propose a pattern that seems like it might match the situation, then a different pattern by which I break the situation down into smaller components and then apply patterns to those components. I keep the process up for a while. If I find a "nasty logic error" pattern match at some point in the process, I "know" I've found a "flaw in the argument" or "bug in the design".

  But there's no from-first-principles method by which I developed all these patterns; it's just things that have survived the test of time when other patterns have failed me.

  I don't think people are underestimating the power of LLMs to think; I just think people are overestimating the power of humans to do anything other than language prediction and sensory pattern prediction.

[email protected] #76

                                        This whole era of AI has certainly pushed the brink to existential crisis territory. I think some are even frightened to entertain the prospect that we may not be all that much better than meat machines who on a basic level do pattern matching drawing from the sum total of individual life experience (aka the dataset).

                                        Higher reasoning is taught to humans. We have the capability. That's why we spend the first quarter of our lives in education. Sometimes not all of us are able.

                                        I'm sure it would certainly make waves if researchers did studies based on whether dumber humans are any different than AI.

• [email protected] wrote:

  LOOK MAA I AM ON FRONT PAGE

[email protected] #77

                                          I see a lot of misunderstandings in the comments 🫤

                                          This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

                                          Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.
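The "rewarded only on the final answer" point above can be made concrete with a toy sketch. This is a hypothetical illustration (the `outcome_only_reward` function and its arguments are invented for this example, not any lab's actual training code): under outcome-only reward, the intermediate reasoning trace is never checked, so a nonsense trace with a correct answer scores the same as a careful one.

```python
# Hypothetical sketch of outcome-only reward: the model's intermediate
# "reasoning" steps are ignored; only the final answer is compared
# against the target.

def outcome_only_reward(reasoning_steps: list[str],
                        final_answer: str,
                        target: str) -> float:
    # The reasoning trace is never inspected.
    return 1.0 if final_answer.strip() == target.strip() else 0.0

# A trace full of nonsense still earns full reward if the answer matches:
nonsense_trace = ["2 + 2 = 5", "therefore the moon is cheese"]
print(outcome_only_reward(nonsense_trace, "42", "42"))  # 1.0

# A careful trace earns nothing if the final answer is wrong:
careful_trace = ["decompose the problem", "check each step"]
print(outcome_only_reward(careful_trace, "41", "42"))   # 0.0
```

A process-supervised alternative would instead score each step of the trace, which is roughly the correction the comment suggests the training process may need.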
