Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

[email protected]

My impression of LLM training and deployment is that it's actually massively parallel in nature - which can be implemented one instruction at a time - but isn't in practice.

[email protected]

I think as we approach the uncanny valley of machine intelligence, it's no longer a cute cartoon but a menacing, creepy, not-quite imitation of ourselves.

[email protected]

While that's a fair idea, there are still two issues with it: hallucinations and the cost of running the models.

Unfortunately, it takes significant compute resources to produce even simple responses, and these responses can be totally made up but still made to look completely real. It's gotten much better, sure, but blindly trusting these things (which many people do) can have serious consequences.

[email protected]

The AI stands for Actually Indians /s

[email protected]

I'm not trained or paid to reason, I am trained and paid to follow established corporate procedures. On rare occasions my input is sought to improve those procedures, but the vast majority of my time is spent executing tasks governed by a body of (not quite complete, sometimes conflicting) procedural instructions.

If AI can execute those procedures as well as, or better than, human employees, I doubt employers will care if it is reasoning or not.

[email protected]

When are people going to realize that, in its current state, an LLM is not intelligent? It doesn’t reason. It does not have intuition. It’s a word predictor.
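For readers unfamiliar with the phrase, here is a toy sketch of what "word predictor" means in its simplest form: a bigram counter that returns the word most often seen after a given word. This is an illustration only. Real LLMs use neural networks over tokens, not lookup tables, and the corpus and function names (`train_bigrams`, `predict_next`) are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus, chosen only to make the counts easy to follow.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(predict_next(model, "the"))
print(predict_next(model, "dog"))
```

The predictor can only echo what followed a word in its training text, and has nothing to say about a word it never saw.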

[email protected]

That indicates that this particular model does not follow instructions, not that it is architecturally fundamentally incapable.

[email protected]

OK, and? A car doesn't run like a horse either, yet they are still very useful.

I'm fine with the distinction between human reasoning and LLM "reasoning".

[email protected]

Then use a different word. "AI" and "reasoning" make people think of Skynet, which is what the weird tech bros want the layperson to think of. LLMs do not "think", but that's not to say I might not be persuaded of their utility. But that's not the way they are being marketed.

[email protected]

Machine-learning-based pattern matching is indeed very useful and profitable when applied correctly. It can identify (with confidence levels) features in data that would otherwise take an extremely well-trained person to find. And even then it's just for the cursory search that takes the longest, before presenting the highest-confidence candidate results to a person for evaluation. Think: scanning medical data for indicators of cancer, reading live data from machines to predict failure, etc.

And what we call "AI" right now is just a much, much more user-friendly version of pattern matching - the primary feature of LLMs is that they natively interact with plain language prompts.
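The workflow described above (score candidates, then hand only the highest-confidence ones to a person) can be sketched roughly as follows. Everything here is hypothetical: the `Finding` type, the `shortlist_for_review` helper, the threshold and the sample scores are invented for illustration and do not come from any real system.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    item_id: str
    label: str
    confidence: float  # model score in [0, 1]; assumed to come from some classifier


def shortlist_for_review(findings, threshold=0.8, top_k=3):
    """Keep findings at or above `threshold`, highest confidence first, capped at `top_k`."""
    confident = [f for f in findings if f.confidence >= threshold]
    confident.sort(key=lambda f: f.confidence, reverse=True)
    return confident[:top_k]


# Made-up scores standing in for the output of a pattern-matching model.
findings = [
    Finding("scan-001", "anomaly", 0.95),
    Finding("scan-002", "anomaly", 0.40),
    Finding("scan-003", "anomaly", 0.85),
    Finding("scan-004", "anomaly", 0.99),
]

queue = shortlist_for_review(findings)
for f in queue:
    print(f.item_id, f.confidence)
```

The model only narrows the search; the final call on each shortlisted item still belongs to the human reviewer.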

[email protected]

Not "this particular model". Frontier LRMs such as OpenAI’s o1/o3, DeepSeek-R1, Claude 3.7 Sonnet Thinking, and Gemini Thinking.

The paper shows that Large Reasoning Models as defined today cannot interpret instructions. Their architecture does not allow it.

[email protected]

Sure. We weren't discussing if AI creates value or not. If you ask a different question then you get a different answer.

[email protected]

Is thinking necessarily biologic?

[email protected]

No. They don't. We just call them proteins.

[email protected]

Wow, it's almost like the computer scientists were saying this from the start but were shouted over by marketing teams.

[email protected]

The guy selling the car doesn't tell you it runs like a horse; the guy selling you AI is telling you it has reasoning skills. AI absolutely has utility, but the guys making it are saying its utility is nearly limitless, because Tesla has demonstrated there's no actual penalty for lying to investors.

[email protected]

Ragebait?

I'm in robotics and find plenty of use for ML methods. Think of image classifiers: how do you want to approach that without oversimplified problem settings?
Or even in control or coordination problems, which can sometimes become NP-hard. ML methods are quite solid at learning patterns in high-dimensional NP-hard problem settings. They often outperform hand-crafted conventional suboptimal solvers when you weigh computation effort against solution quality, and they especially outperform (asymptotically) optimal solvers time-wise, with solutions that are not optimal but "good enough" nevertheless. (OK, to be fair, suboptimal solvers do that as well, but since ML methods can outperform these, I see it as an attractive middle ground.)
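To make the "optimal but slow" versus "good enough but fast" trade-off concrete, here is a small sketch on a travelling-salesman instance. Note the assumption: the fast solver below is a hand-written nearest-neighbour heuristic, not an ML method, so it only stands in for the suboptimal side of the comparison. The points and function names are made up for illustration.

```python
import itertools
import math


def tour_length(points, order):
    """Total length of the closed tour visiting `points` in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def optimal_tour(points):
    """Exhaustive search over every ordering; runtime grows factorially."""
    n = len(points)
    best = min(
        ((0,) + perm for perm in itertools.permutations(range(1, n))),
        key=lambda order: tour_length(points, order),
    )
    return list(best)


def greedy_tour(points):
    """Nearest-neighbour heuristic: fast, with no optimality guarantee."""
    unvisited = set(range(1, len(points)))
    order = [0]
    while unvisited:
        last = points[order[-1]]
        nearest = min(unvisited, key=lambda i: math.dist(last, points[i]))
        order.append(nearest)
        unvisited.remove(nearest)
    return order


# Hypothetical 7-point instance, small enough for the exhaustive solver.
points = [(0, 0), (3, 1), (1, 4), (5, 5), (2, 2), (6, 0), (4, 3)]
best = optimal_tour(points)
quick = greedy_tour(points)
print(tour_length(points, best), tour_length(points, quick))
```

The exhaustive solver becomes impractical after a dozen or so points, while the heuristic keeps scaling. A learned solver, as described above, would aim to sit between the two on solution quality at a comparable runtime.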

[email protected]

This! Capitalism is going to be the end of us all. OpenAI has gotten away with IP theft, disinformation regarding AI, and maybe even the murder of their whistleblower.

[email protected]

If you want to boil down human reasoning to pattern recognition, the sheer amount of stimuli and associations built off of that input absolutely dwarfs anything an LLM will ever be able to handle. It's like comparing PhD reasoning to a dog's reasoning.

While a dog can learn some interesting tricks and the smartest dogs can solve simple novel problems, there are hard limits. They simply lack strong metacognition and the ability to make simple logical inferences (e.g., why they fail at the shell game).

Now we make that chasm even larger by cutting the stimuli to a fixed token limit. An LLM can do some clever tricks within that limit, but it's designed to do exactly those tricks and nothing more. To get anything resembling human ability you would have to design something to match human complexity, and we don't have the tech to make a synthetic human.

[email protected]

Those particular models. It does not prove the architecture doesn't allow it at all. It's still possible that this is solvable with a different training technique, and none of those are using the right one. That's what they need to prove wrong.

This proves the issue is widespread, not fundamental.

agnos.is Forums