Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.
-
OK, and? A car doesn't run like a horse either, yet they are still very useful.
I'm fine with the distinction between human reasoning and LLM "reasoning".
The guy selling the car doesn't tell you it runs like a horse; the guy selling you AI is telling you it has reasoning skills. AI absolutely has utility, but the guys making it are saying its utility is nearly limitless, because Tesla has demonstrated there's no actual penalty for lying to investors.
-
Lots of us who did some time in search and relevancy early on knew ML was always largely breathless, overhyped marketing. It was endless buzzwords and misframing from the start, but it raised our salaries. Anything the execs don't understand is profitable and worth doing.
Ragebait?
I'm in robotics and find plenty of use for ML methods. Think of image classifiers: how would you approach those without oversimplified problem settings?
Or take control and coordination problems, which can become NP-hard. Even though they aren't optimal, ML methods are quite solid at learning patterns in high-dimensional NP-hard problem settings, often outperforming hand-crafted conventional suboptimal solvers on computation effort versus solution quality, and especially outperforming (asymptotically) optimal solvers time-wise, even if only with "good enough" rather than optimal solutions. (OK, to be fair, suboptimal solvers do that as well, but since ML methods can outperform those too, I see them as an attractive middle ground.)
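To make that trade-off concrete, here's a toy sketch. A greedy nearest-neighbour pass stands in for a learned model (training a real policy network would be overkill for a comment), and the cities, sizes, and cost function are all made up for illustration:

```python
# Toy comparison: exhaustive (optimal) TSP solver vs. a cheap heuristic.
import itertools
import math
import random
import time

random.seed(0)
cities = [(random.random(), random.random()) for _ in range(9)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_length(order):
    # total length of the closed tour visiting cities in this order
    return sum(dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
               for i in range(len(order)))

# Optimal solver: brute force over all permutations, O(n!) time.
t0 = time.perf_counter()
best = min(itertools.permutations(range(len(cities))), key=tour_length)
t_exact = time.perf_counter() - t0

# Heuristic stand-in for a learned model: greedy nearest neighbour, O(n^2) time.
t0 = time.perf_counter()
order, remaining = [0], set(range(1, len(cities)))
while remaining:
    nxt = min(remaining, key=lambda c: dist(cities[order[-1]], cities[c]))
    order.append(nxt)
    remaining.remove(nxt)
t_greedy = time.perf_counter() - t0

print(f"optimal: length {tour_length(best):.3f} in {t_exact:.2f}s")
print(f"greedy : length {tour_length(order):.3f} in {t_greedy:.6f}s")
```

The heuristic's tour is a bit longer, but it finishes orders of magnitude faster - that's the "good enough" trade-off, and a learned heuristic aims to push the quality side of it higher.
-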
Wow it's almost like the computer scientists were saying this from the start but were shouted over by marketing teams.
This! Capitalism is going to be the end of us all. OpenAI has gotten away with IP theft, disinformation regarding AI, and maybe even the murder of their whistleblower.
-
What confuses me is that we keep moving the goalposts on what counts as reasoning. Not too long ago, some smart algorithms or a bunch of if/then instructions for software officially counted, by definition, as software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory, and even more advanced algorithms, it's no longer reasoning? I feel like at this point the more relevant question is "What exactly is reasoning?". Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.
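For reference, here's a minimal sketch of the kind of if/then inference that used to count as machine reasoning (the facts and rules are made up for illustration):

```python
# Minimal forward-chaining rule engine: the kind of if/then inference
# that classically counted as "computer reasoning".
facts = {"socrates is a man"}
rules = [
    ("socrates is a man", "socrates is mortal"),
    ("socrates is mortal", "socrates will die"),
]

derived_new_fact = True
while derived_new_fact:
    derived_new_fact = False
    for condition, conclusion in rules:
        if condition in facts and conclusion not in facts:
            facts.add(conclusion)  # infer a new fact from known ones
            derived_new_fact = True

print(facts)  # {'socrates is a man', 'socrates is mortal', 'socrates will die'}
```
-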
If you want to boil down human reasoning to pattern recognition, the sheer amount of stimuli and associations built off of that input absolutely dwarfs anything an LLM will ever be able to handle. It's like comparing PhD reasoning to a dog's reasoning.
While a dog can learn some interesting tricks and the smartest dogs can solve simple novel problems, there are hard limits. They simply lack strong metacognition and the ability to make simple logical inferences (e.g., that's why they fail at the shell game).
Now we make that chasm even larger by cutting the stimuli to a fixed token limit. An LLM can do some clever tricks within that limit, but it's designed to do exactly those tricks and nothing more. To get anything resembling human ability you would have to design something to match human complexity, and we don't have the tech to make a synthetic human.
-
Not "This particular model". Frontier LRMs s OpenAI’s o1/o3,DeepSeek-R, Claude 3.7 Sonnet Thinking, and Gemini Thinking.
The paper shows that Large Reasoning Models as defined today cannot interpret instructions. Their architecture does not allow it.
Those particular models. It does not prove the architecture doesn't allow it at all. It's still possible that this is solvable with a different training technique, and none of those models are using the right one. That's what they'd need to prove wrong.
This proves the issue is widespread, not that it's fundamental.
-
No. They don't. We just call them proteins.
You are either vastly overestimating the Language part of an LLM or simplifying human physiology back to the Greeks' Four Humours theory.
-
No. They don't. We just call them proteins.
"They".
What are you?
-
That’s absolutely what it is. It’s a pattern on here. Any acknowledgment of humans being animals or less than superior gets hit with pushback.
I didn't say we aren't animals or that we don't follow physics rules.
But what you're saying is the equivalent of "everything that goes up will eventually go down - that's how physics works and you don't see that, you're in denial!!!11!!!1"
-
Proving it matters. Science is constantly testing things that people believe are obvious, because people have an uncanny ability to believe things that are false. Some people will keep believing things long after science has proven them false.
I mean… “proving” is also just marketing speak. There is no clear definition of reasoning, so there’s also no way to prove or disprove that something/someone reasons.
-
While it's a fair idea, there are still two issues with it: hallucinations and the cost of running the models.
Unfortunately, it takes significant compute resources to produce even simple responses, and those responses can be totally made up yet still made to look completely real. It's gotten much better, sure, but blindly trusting these things (which many people do) can have serious consequences.
Hallucinations and the cost of running the models.
So, inaccurate information in books is nothing new. Agreed that the rate of hallucinations needs to decline, a lot, but there has always been a need for a veracity filter - just because it comes from "a book" or "the TV" has never been an indication of absolute truth, even though many people stop there and assume it is. In other words: blind trust is not a new problem.
The cost of running the models is an interesting one - how does it compare with publication on paper to ship globally to store in environmentally controlled libraries which require individuals to physically travel to/from the libraries to access the information? What's the price of the resulting increased ignorance of the general population due to the high cost of information access?
What good is a bunch of knowledge stuck behind a search engine when people don't know how to access it, or access it efficiently?
Granted, search engines already take us 95% (IMO) of the way from paper libraries to what AI is almost succeeding at being today, but ease of access to information has tremendous value - and developing ways to easily access the information available on the internet is a very valuable endeavor.
Personally, I feel more emphasis should be put on establishing the veracity of the information before we go making all the garbage easier to find.
I also worry that "easy access" to automated interpretation services is going to lead to a bunch of information encoded in languages that most people don't know because they're dependent on machines to do the translation for them. As an example: shiny new computer language comes out but software developer is too lazy to learn it, developer uses AI to write code in the new language instead...
-
Sure. We weren't discussing if AI creates value or not. If you ask a different question then you get a different answer.
Well - if you want to devolve into argument, you can argue all day long about "what is reasoning?"
-
When are people going to realize, in its current state, an LLM is not intelligent. It doesn't reason. It does not have intuition. It's a word predictor.
I agree with you. In its current state, an LLM is not sentient, and thus not "intelligent".
-
When are people going to realize, in its current state, an LLM is not intelligent. It doesn't reason. It does not have intuition. It's a word predictor.
And that's pretty damn useful - it's just obnoxious to have expectations set so wildly incorrectly.
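For anyone curious what "word predictor" means mechanically, here's a toy bigram version (deliberately crude, with a made-up corpus; real LLMs learn the distribution with a neural net instead of counting):

```python
# Toy next-word predictor: count bigrams, then emit the likeliest follower.
# LLMs do the same job at vastly larger scale with learned weights instead
# of counts, but the interface is identical: context in, next word out.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept".split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(word):
    followers = bigrams[word]
    return followers.most_common(1)[0][0] if followers else None

print(predict("the"))  # -> 'cat', the most frequent continuation
```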
-
Those particular models. It does not prove the architecture doesn't allow it at all. It's still possible that this is solvable with a different training technique, and none of those models are using the right one. That's what they'd need to prove wrong.
This proves the issue is widespread, not that it's fundamental.
Is "model" not defined as architecture+weights? Those models certainly don't share the same architecture. I might just be confused about your point though
-
When are people going to realize, in its current state, an LLM is not intelligent. It doesn't reason. It does not have intuition. It's a word predictor.
People think they want AI, but they don’t even know what AI is on a conceptual level.
-
Funny how triggering it is for some people when anyone acknowledges humans are just evolved primates doing the same pattern matching.
We actually have sentience, though, and are capable of creating new things and having realizations. AI isn't real, and LLMs and diffusion models are simply reiterating algorithmic patterns; no LLM or diffusion model can create anything original or expressive.
Also, we aren’t “evolved primates.” We are just primates, the thing is, primates are the most socially and cognitively evolved species on the planet, so that’s not a denigrating sentiment unless your a pompous condescending little shit.
-
It’s built by animals, and it reflects them. That’s impressive on its own. Doesn’t need to be exaggerated.
Impressive ≠ substantial or beneficial.
-
What they mean is that before Turing, "computer" was literally a person's job description. You hand a professional a stack of calculations with some typos, and part of the job is correcting those. When a newfangled machine comes along with the same name as the job, among the first things people are gonna ask about is where it falls short.
Like, if I made a machine called "assistant", it'd be natural for people to point out and ask about all the things a person can do that a machine just never could.
And what I mean is that prior to the mid 1900s the etymology didn't exist to cause that confusion of terms. Neither Babbage's machines nor prior adding engines were called computers or calculators. They were 'machines' or 'engines'.
Babbage's machines were novel in that they could do multiple types of operations, but 'mechanical calculators' and counting machines were ~200 years old. Other mathematical tools like the abacus are obviously far older. They were not novel enough to cause confusion in anyone with even passing interest.
But there will always be people who just assume 'magic', and/or "it works like I want it to".
-
LOOK MAA I AM ON FRONT PAGE
Peak pseudo-science. The burden of evidence is on the grifters who claim "reason". But neither side has any objective definition of what "reason" means. It's pseudo-science against pseudo-science in a fierce battle.
-
Some AI researchers found it obvious as well, in the sense that they'd suspected it and had some indications. But it's good to see more data affirming this assessment.
Particularly to counter some of the more baseless marketing assertions about the nature of the technology.