Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought

[email protected]

The 7 times table is unfriendly?

I love 7 timeses. If numbers were sentient, I think I could be friends with 7.

[email protected]

Genuine question regarding the rhyme thing: it can be argued that "predicting backwards isn't very different", but you can't attribute generating the rhyme first to noise, right? So how does it "know" (for lack of a better word) to generate the rhyme first?

[email protected]

I've always hated it and eight. I can only remember the ones that are familiar at a glance from the reverse table and to this day I sometimes just sum up and down from those "anchor" references. They're so weird and slippery.

[email protected]

Huh.

Going back to the "being friends" thing, I think you and I could be friends due to applying qualities to numbers; but I think it might be challenging because I find 7 and 8 to be two of the best. They're quirky, but interesting.

Thank you for the insight.

[email protected]

It already knows which words are, statistically, more commonly rhymed with each other. From the massive list of training poems. This is what the massive data sets are for. One of the interesting things is that it's not predicting backwards, exactly. It's actually mathematically converging on the response text to the prompt, all the words at the same time.

[email protected]

For me personally, anything times 5 can be reached by halving the number, then multiplying that number by 10.

Example: 66 x 5 = Y

(66/2) x (5x2) = Y
- cancel out the division by creating equal multiplication in the other number
- 66/2 = 33
- 5x2 = 10
33 x 10 = Y
33 x 10 = 330
Y = 330
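
The halve-then-times-ten trick above can be sketched in a few lines of Python. The function name `times_five` is made up for illustration, and the handling of odd numbers (adding 5 for the leftover half) is an assumption, since the worked example only covers an even number.

```python
def times_five(n: int) -> int:
    """Multiply n by 5 by halving it and then multiplying by 10."""
    half, remainder = divmod(n, 2)  # 66 -> (33, 0); 7 -> (3, 1)
    # Assumption: the leftover 1 from an odd number is worth half of 10, i.e. 5.
    return half * 10 + remainder * 5


print(times_five(66))  # 330
```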

[email protected]

Probably, given that LLMs only exist in the domain of language. It's still interesting that they seem to have a "conceptual" system that is commonly shared between languages.

[email protected]

See, for me, it’s not that 7*5 is easier to compute than 7*3, it’s that 5*7 is easier to compute than 7*3.

I saw your other comment about 8’s, too, and I’ve always found those to be a pain, so I reverse them, if not outright convert them to arithmetic problems. 8x4 is some unknown value, but X*8 is always X*10-2X, although I do have most of the multiplication tables memorized for lower values.
8*7 is an unknown number that only the wisest sages can compute, however.
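
For what it's worth, the X*8 = X*10-2X rewrite is easy to check with a small Python sketch. The name `times_eight` is hypothetical, chosen only for this illustration.

```python
def times_eight(x: int) -> int:
    """Multiply x by 8 as (x * 10) - (2 * x)."""
    return x * 10 - 2 * x


print(times_eight(7))  # 56
print(times_eight(4))  # 32
```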

[email protected]

I don't think that's really a fair comparison. Babies exist with images and sounds for over a year before they begin to learn language, so it would make sense that they begin to understand the world in non-linguistic terms and then apply language to that. LLMs only exist in relation to language, so they couldn't understand a concept separately from language. It would be like asking a person to conceptualise radio waves prior to having heard about them.

[email protected]

Exactly. It's sort of like a massively scaled up example of the blind man and the elephant.

[email protected]

Unfortunately, these articles are often written by people who don't know enough to realize they're missing important nuances.

[email protected]

"Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95," the MIT article explains.

That is precisely how I do math. Feel a little targeted that they called this odd.
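
A rough sketch of that two-path idea in Python, purely as an illustration of the quoted description and not of what actually happens inside the model. The function names are invented, the rough estimate is passed in by hand (92 for the quoted example), and the sketch assumes the estimate is less than 5 away from the true sum.

```python
def last_digit_path(a: int, b: int) -> int:
    """Exact path: only look at the final digits (6 and 9 -> 5)."""
    return (a % 10 + b % 10) % 10


def combine(estimate: int, last_digit: int) -> int:
    """Pick the number ending in last_digit that is closest to the estimate."""
    base = estimate - (estimate % 10) + last_digit
    candidates = (base - 10, base, base + 10)
    return min(candidates, key=lambda c: abs(c - estimate))


def add_two_paths(a: int, b: int, estimate: int) -> int:
    # Assumption: 'estimate' stands in for the "92ish" approximate path.
    return combine(estimate, last_digit_path(a, b))


print(add_two_paths(36, 59, 92))  # 95
```

With a fuzzier estimate (off by 5 or more) the sketch can snap to the wrong ten, which is one way to picture how a confident but wrong sum might come out.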

[email protected]

I think a lot of services are doing this behind the scenes already. Otherwise ChatGPT would be getting basic arithmetic wrong a lot more, considering the methods the article has shown it's using.

[email protected]

Rote memorization should be minimized in school curriculum

[email protected]

Another very surprising outcome of the research is the discovery that these LLMs do not, as is widely assumed, operate by merely predicting the next word. By tracing how Claude generated rhyming couplets, Anthropic found that it chose the rhyming word at the end of verses first, then filled in the rest of the line.

If the LLM already knows the full sentence it's going to output from the first word it "guesses", I wonder if you could short-circuit it and have it just give the full sentence instead of doing a cycle for each word of the sentence. That could maybe cut down on LLM energy costs.

[email protected]

(72 * 10) + (2 * 3) = x

There, fixed, because otherwise order of operation gets fucky.

[email protected]

Then take that concept further, and let it keep introspecting and inspecting how it comes to the conclusions it does and eventually....

[email protected]

Which is exactly how we do it.

[email protected]

You know they don't think - even though "It's a peculiar truth that we don't understand how large language models (LLMs) actually work."?

It's truly shocking to read this from a mess of connected neurons and synapses like yourself. You're simply doing fancy word prediction of the next word /s

[email protected]

But how is this different from your average redditor?

agnos.is Forums
