Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought

[email protected]

Well, I guess I do a bit of the same:) I do (70+2)(10+3) -> 700+210+20+6

[email protected]

That math process for adding the two numbers - there's nothing wrong with it at all. Estimate the total and come up with a range. Determine exactly what the last digit is. In the example, there's only one number in the range with 5 as the last digit. That must be the answer. Hell, I might even use that same method in my own head.

The poetry example, people use that one often enough, too. Come up with a couple of words you would have fun rhyming, and build the lines around those words. Nothing wrong with that, either.

These two processes are closer to "thought" than I previously imagined.

[email protected]

It really doesn't. You're just describing the "fancy" part of "fancy autocomplete." No one was ever really suggesting that they only predict the next word. If that was the case they would just be autocomplete, nothing fancy about it.

What's being conveyed by "fancy autocomplete" is that these models ultimately operate by combining the most statistically likely elements of their dataset, with some application of random noise. More noise creates more "creative" (meaning more random, less probable) outputs. They do not actually "think" as we understand thought. This can clearly be seen in the examples given in the article, especially to do with math. The model is throwing together elements that are statistically proximate to the prompt. It's not actually applying a structured, logical method the way humans can be taught to.

[email protected]

Compared to a human who forms an abstract thought and then translates that thought into words. Which words I use has little to do with which other words I’ve used except to make sure I’m following the rules of grammar.

Interesting that...

Anthropic also found, among other things, that Claude "sometimes thinks in a conceptual space that is shared between languages, suggesting it has a kind of universal 'language of thought'."

[email protected]

anything that claims it "thinks" in any way I immediately dismiss as an advertisement of some sort. these models are doing very interesting things, but it is in no way "thinking" as a sentient mind does.

[email protected]

I wish I could find the article. It was researchers and they were freaked out just as much as anyone else. It's like slightly over chance that it "thought," not some huge revolutionary leap.

[email protected]

there has been a flooding of these articles. everyone wants to sell their llm as "the smartest one closest to a real human" even though the entire concept of calling them AI is a marketing misnomer

[email protected]

Maybe? Didn't seem like a sales job at the time, more like a warning. You could be right though.

[email protected]

I would do 720 + 370 + 32

[email protected]

That has always been the case. Even basic programs need debugging sometimes, so we developed debuggers.

[email protected]

It doesn't, who the hell cares if someone allowed it to break "predict whole text" into "predict part by part, and then "with rhyme, we start at the end". Sounds like a naive (not as in "simplistic", but as "most straightforward") way to code this, so given the task to write an automatic poetry producer, I would start with something similar. The whole thing still stands as fancy auto-complete

[email protected]

Yeah but I think this is still the same, just not a single language. It might think in some mix of languages (which you can actuaysee sometimes if you push certain LLMs to their limit and they start producing mixed language responses.)

But it still has limitations because of the structure in language. This is actually a thing that humans have as well, the limiting of abstract thought through internal monologue thinking

[email protected]

Well, it falls apart pretty easily. LLMs are notoriously bad at math. And even if it was accurate consistently, it's not exactly efficient, when a calculator from the 80s can do the same thing.

We have setups where LLMs can call external functions, but I think it would be cool and useful to be able to replace certain internal processes.

As a side note though, while I don't think that it's a "true" thought process, I do think there's a lot of similarity with LLMs and the human subconscious. A lot of LLM behaviour reminds me of split brain patients.

And as for the math aspect, it does seem like it does math very similarly to us. Studies show that we think of small numbers as discrete quantities, but big numbers in terms of relative size, which seems like exactly what this model is doing.

I just don't think it's a particularly good way of doing mental math. Natural intuition in humans and gradient descent in LLMs both seem to create layered heuristics that can become pretty much arbitrarily complex, but it still makes more sense to follow an exact algorithm for some things.

[email protected]

But here’s the really funky bit. If you ask Claude how it got the correct answer of 95, it will apparently tell you, “I added the ones (6+9=15), carried the 1, then added the 10s (3+5+1=9), resulting in 95.” But that actually only reflects common answers in its training data as to how the sum might be completed, as opposed to what it actually did.

This is not surprising. LLMs are not designed to have any introspection capabilities.

Introspection could probably be tacked onto existing architectures in a few different ways, but as far as I know nobody's done it yet. It will be interesting to see how that might change LLM behavior.

[email protected]

This is pretty normal, in my opinion. Every time people complain about common core arithmetic there are dozens of us who come out of the woodwork to argue that the concepts being taught are important for deeper understanding of math, beyond just rote memorization of pencil and paper algorithms.

[email protected]

72 * 10 + 70 * 3 + 2 * 3

That's what I do in my head if I need an exact result. If I'm approximateing I'll probably just do something like 70 * 15 which is much easier to compute (70 * 10 + 70 * 5 = 700 + 350 = 1050).

[email protected]

I think what's wild about it is that it really is surprisingly similar to how we actually think. It's very different from how a computer (calculator) would calculate it.

So it's not a strange method for humans but that's what makes it so fascinating, no?

[email protected]

OK, I've been willing to just let the examples roll even though most people are just describing how they'd do the calculation, not a process of gradual approximation, which was supposed to be the point of the way the LLM does it...

...but this one got me.

Seriously, you think 70x5 is easier to compute than 70x3? Not only is that a harder one to get to for me in the notoriously unfriendly 7 times table, but it's also further away from the correct answer and past the intuitive upper limit of 1000.

[email protected]

Yeah I caught that too, I'd be curious to know more about what specifically they meant by that.

Being able to link all of the words that have a similar meaning, say, nearby, close, adjacent, proximal, side-by-side, etc and realize they all share something in common could be done in many ways. Some would require an abstract understanding of what spatial distance actually is, an understanding of physical reality. Others would not, one could simply make use of word adjacency, noticing that all of these words are frequently used alongside certain other words. This would not be abstract, it'd be more of a simple sum of clear correlations. You could call this mathematical framework a universal language if you wanted.

Ultimately, a person learns meaning and then applies language to it. When I'm a baby I see my mother, and know my mother is something that exists. Then I learn the word "mother" and apply it to her. The abstract comes first. Can an LLM do something similar despite having never seen anything that isn't a word or number?

[email protected]

when a calculator from the 80s can do the same thing.

1970's! The little blighters are even older than most people think.

Which is why I find it extra hilarious / extra infuriating that we've gone through all of these contortions and huge wastes of computing power and electricity to ultimately just make a computer worse at math.

Math is the one thing that computers are inherently good at. It's what they're for. Trying to use LLM's to perform it halfassedly is a completely braindead endeavor.

agnos.is Forums

Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought