Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.
-
OK, and? A car doesn't run like a horse either, yet they are still very useful.
I'm fine with the distinction between human reasoning and LLM "reasoning".
The guy selling the car doesn't tell you it runs like a horse; the guy selling you AI is telling you it has reasoning skills. AI absolutely has utility, but the guys making it are saying its utility is nearly limitless, because Tesla has demonstrated there's no actual penalty for lying to investors.
-
Lots of us who did some time in search and relevancy early on knew ML was always largely breathless, overhyped marketing. It was endless buzzwords and misframing from the start, but it raised our salaries. Anything the execs don't understand is profitable and worth doing.
Ragebait?
I'm in robotics and find plenty of use for ML methods. Think of image classifiers: how would you approach those without oversimplified problem settings?
Or take control and coordination problems, which can become NP-hard. Even though they aren't optimal, ML methods are quite solid at learning patterns in high-dimensional NP-hard problem settings, often outperforming hand-crafted conventional suboptimal solvers on computation effort versus solution quality, and especially outperforming (asymptotically) optimal solvers time-wise, even if only with "good enough" rather than optimal solutions. (OK, to be fair, suboptimal solvers do that as well, but since ML methods can outperform those too, I see them as an attractive middle ground.)
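To make that trade-off concrete, here's a toy sketch. A greedy nearest-neighbour pass stands in for a learned model (training a real policy network would be overkill for a comment), and the cities, sizes, and cost function are all made up for illustration:

```python
# Toy comparison: exhaustive (optimal) TSP solver vs. a cheap heuristic.
import itertools
import math
import random
import time

random.seed(0)
cities = [(random.random(), random.random()) for _ in range(9)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_length(order):
    # total length of the closed tour visiting cities in this order
    return sum(dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
               for i in range(len(order)))

# Optimal solver: brute force over all permutations, O(n!) time.
t0 = time.perf_counter()
best = min(itertools.permutations(range(len(cities))), key=tour_length)
t_exact = time.perf_counter() - t0

# Heuristic stand-in for a learned model: greedy nearest neighbour, O(n^2) time.
t0 = time.perf_counter()
order, remaining = [0], set(range(1, len(cities)))
while remaining:
    nxt = min(remaining, key=lambda c: dist(cities[order[-1]], cities[c]))
    order.append(nxt)
    remaining.remove(nxt)
t_greedy = time.perf_counter() - t0

print(f"optimal: length {tour_length(best):.3f} in {t_exact:.2f}s")
print(f"greedy : length {tour_length(order):.3f} in {t_greedy:.6f}s")
```

The heuristic's tour is a bit longer, but it finishes orders of magnitude faster - that's the "good enough" trade-off, and a learned heuristic aims to push the quality side of it higher.
-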
Wow it's almost like the computer scientists were saying this from the start but were shouted over by marketing teams.
This! Capitalism is going to be the end of us all. OpenAI has gotten away with IP theft, disinformation regarding AI, and maybe even the murder of their whistleblower.
-
What confuses me is that we keep moving the goalposts on what counts as reasoning. Not too long ago, some smart algorithms or a bunch of if/then instructions for software officially counted, by definition, as software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory, and even more advanced algorithms, it's no longer reasoning? I feel like at this point the more relevant question is "What exactly is reasoning?". Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.
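For reference, here's a minimal sketch of the kind of if/then inference that used to count as machine reasoning (the facts and rules are made up for illustration):

```python
# Minimal forward-chaining rule engine: the kind of if/then inference
# that classically counted as "computer reasoning".
facts = {"socrates is a man"}
rules = [
    ("socrates is a man", "socrates is mortal"),
    ("socrates is mortal", "socrates will die"),
]

derived_new_fact = True
while derived_new_fact:
    derived_new_fact = False
    for condition, conclusion in rules:
        if condition in facts and conclusion not in facts:
            facts.add(conclusion)  # infer a new fact from known ones
            derived_new_fact = True

print(facts)  # {'socrates is a man', 'socrates is mortal', 'socrates will die'}
```
-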
If you want to boil down human reasoning to pattern recognition, the sheer amount of stimuli and associations built off of that input absolutely dwarfs anything an LLM will ever be able to handle. It's like comparing PhD reasoning to a dog's reasoning.
While a dog can learn some interesting tricks and the smartest dogs can solve simple novel problems, there are hard limits. They simply lack strong metacognition and the ability to make simple logical inferences (e.g., that's why they fail at the shell game).
Now we make that chasm even larger by cutting the stimuli to a fixed token limit. An LLM can do some clever tricks within that limit, but it's designed to do exactly those tricks and nothing more. To get anything resembling human ability you would have to design something to match human complexity, and we don't have the tech to make a synthetic human.
-
Not "This particular model". Frontier LRMs s OpenAI’s o1/o3,DeepSeek-R, Claude 3.7 Sonnet Thinking, and Gemini Thinking.
The paper shows that Large Reasoning Models as defined today cannot interpret instructions. Their architecture does not allow it.
Those particular models. It does not prove the architecture doesn't allow it at all. It's still possible that this is solvable with a different training technique, and none of those models are using the right one. That's what they'd need to prove wrong.
This proves the issue is widespread, not that it's fundamental.
-
No. They don't. We just call them proteins.
You are either vastly overestimating the Language part of an LLM or simplifying human physiology back to the Greeks' Four Humours theory.
-
No. They don't. We just call them proteins.
"They".
What are you?
-
That’s absolutely what it is. It’s a pattern on here. Any acknowledgment of humans being animals or less than superior gets hit with pushback.
I didn't say we aren't animals or that we don't follow physics rules.
But what you're saying is the equivalent of "everything that goes up will eventually go down - that's how physics works and you don't see that, you're in denial!!!11!!!1"
-
Proving it matters. Science is constantly testing things that people believe are obvious, because people have an uncanny ability to believe things that are false. Some people will keep believing things long after science has proven them false.
I mean… “proving” is also just marketing speak. There is no clear definition of reasoning, so there’s also no way to prove or disprove that something/someone reasons.
-
While it's a fair idea, there are still two issues with it: hallucinations and the cost of running the models.
Unfortunately, it takes significant compute resources to produce even simple responses, and those responses can be totally made up yet still made to look completely real. It's gotten much better, sure, but blindly trusting these things (which many people do) can have serious consequences.
Hallucinations and the cost of running the models.
So, inaccurate information in books is nothing new. Agreed that the rate of hallucinations needs to decline, a lot, but there has always been a need for a veracity filter - just because it comes from "a book" or "the TV" has never been an indication of absolute truth, even though many people stop there and assume it is. In other words: blind trust is not a new problem.
The cost of running the models is an interesting one - how does it compare with publication on paper to ship globally to store in environmentally controlled libraries which require individuals to physically travel to/from the libraries to access the information? What's the price of the resulting increased ignorance of the general population due to the high cost of information access?
What good is a bunch of knowledge stuck behind a search engine when people don't know how to access it, or access it efficiently?
Granted, search engines already take us 95% (IMO) of the way from paper libraries to what AI is almost succeeding at being today, but ease of access to information has tremendous value - and developing ways to easily access the information available on the internet is a very valuable endeavor.
Personally, I feel more emphasis should be put on establishing the veracity of the information before we go making all the garbage easier to find.
I also worry that "easy access" to automated interpretation services is going to lead to a bunch of information encoded in languages that most people don't know because they're dependent on machines to do the translation for them. As an example: shiny new computer language comes out but software developer is too lazy to learn it, developer uses AI to write code in the new language instead...
-
Sure. We weren't discussing if AI creates value or not. If you ask a different question then you get a different answer.
Well - if you want to devolve into argument, you can argue all day long about "what is reasoning?"
-
When are people going to realize, in its current state, an LLM is not intelligent. It doesn't reason. It does not have intuition. It's a word predictor.
I agree with you. In its current state, an LLM is not sentient, and thus not "intelligent".
-
When are people going to realize, in its current state, an LLM is not intelligent. It doesn't reason. It does not have intuition. It's a word predictor.
And that's pretty damn useful - it's just obnoxious to have expectations set so wildly incorrectly.
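For anyone curious what "word predictor" means mechanically, here's a toy bigram version (deliberately crude, with a made-up corpus; real LLMs learn the distribution with a neural net instead of counting):

```python
# Toy next-word predictor: count bigrams, then emit the likeliest follower.
# LLMs do the same job at vastly larger scale with learned weights instead
# of counts, but the interface is identical: context in, next word out.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept".split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(word):
    followers = bigrams[word]
    return followers.most_common(1)[0][0] if followers else None

print(predict("the"))  # -> 'cat', the most frequent continuation
```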
-
Those particular models. It does not prove the architecture doesn't allow it at all. It's still possible that this is solvable with a different training technique, and none of those models are using the right one. That's what they'd need to prove wrong.
This proves the issue is widespread, not that it's fundamental.
Is "model" not defined as architecture+weights? Those models certainly don't share the same architecture. I might just be confused about your point though
-
When are people going to realize, in its current state, an LLM is not intelligent. It doesn't reason. It does not have intuition. It's a word predictor.
People think they want AI, but they don’t even know what AI is on a conceptual level.
-
Funny how triggering it is for some people when anyone acknowledges humans are just evolved primates doing the same pattern matching.
We actually have sentience, though, and are capable of creating new things and having realizations. AI isn't real, and LLMs and diffusion models are simply reiterating algorithmic patterns; no LLM or diffusion model can create anything original or expressive.
Also, we aren’t “evolved primates.” We are just primates, the thing is, primates are the most socially and cognitively evolved species on the planet, so that’s not a denigrating sentiment unless your a pompous condescending little shit.
-
It’s built by animals, and it reflects them. That’s impressive on its own. Doesn’t need to be exaggerated.
Impressive ≠ substantial or beneficial.
-
What they mean is that before Turing, "computer" was literally a person's job description. You hand a professional a stack of calculations with some typos, and part of the job is correcting those. When a newfangled machine comes along with the same name as the job, among the first things people are gonna ask about is where it falls short.
Like, if I made a machine called "assistant", it'd be natural for people to point out and ask about all the things a person can do that a machine just never could.
And what I mean is that prior to the mid 1900s the etymology didn't exist to cause that confusion of terms. Neither Babbage's machines nor prior adding engines were called computers or calculators. They were 'machines' or 'engines'.
Babbage's machines were novel in that they could do multiple types of operations, but 'mechanical calculators' and counting machines were ~200 years old. Other mathematical tools like the abacus are obviously far older. They were not novel enough to cause confusion in anyone with even passing interest.
But there will always be people who just assume 'magic', and/or "it works like I want it to".
-
LOOK MAA I AM ON FRONT PAGE
Peak pseudo-science. The burden of evidence is on the grifters who claim "reason". But neither side has any objective definition of what "reason" means. It's pseudo-science against pseudo-science in a fierce battle.
-
Some AI researchers found it obvious as well, in the sense that they'd suspected it and had some indications. But it's good to see more data affirming this assessment.
Particularly to counter some of the more baseless marketing assertions about the nature of the technology.