Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought
-
"Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95," the MIT article explains."
That is precisely how I do math. Feel a little targeted that they called this odd.
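Out of curiosity, here's roughly what that two-track process looks like as code. To be clear, this is a toy analogue: the model's actual mechanism is learned and fuzzy, and the rounding granularity below is just my guess at the "ish" estimates.

```python
# Toy analogue of the two-track addition the article describes: one
# path gets a rough magnitude, another nails the exact last digit,
# and the two answers are combined at the end.

def fuzzy_add(a: int, b: int) -> int:
    # Track 1: rough magnitude. Rounding each operand to the nearest
    # multiple of 5 keeps the estimate within +/-4 of the true sum.
    est = 5 * round(a / 5) + 5 * round(b / 5)  # 36 + 59 -> 35 + 60 = 95

    # Track 2: exact last digit via modular arithmetic ("ends in 5").
    last = (a % 10 + b % 10) % 10              # (6 + 9) % 10 = 5

    # Combine: snap the estimate to the unique nearby value whose
    # last digit matches.
    diff = (last - est) % 10
    if diff > 5:
        diff -= 10
    return est + diff

assert fuzzy_add(36, 59) == 95
assert all(fuzzy_add(a, b) == a + b for a in range(100) for b in range(100))
```

Because the estimate is never off by more than 4 and the last digit is exact, combining the two tracks is always unambiguous.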
I use a calculator. Which an AI should also be and not need to do weird shit to do math.
-
That bit about how it turns out they aren't actually just predicting the next word is crazy and kinda blows the whole "It's just a fancy text auto-complete" argument out of the water IMO
I mean it implies that they CAN start with the conclusion or the "thought" and then generate the text to verbalize that.
It's shocking the lengths humans will go to in order to explain how their wetware neural network is fundamentally different and why it's impossible for LLMs to think or reason in any way. Honestly, LLMs teach us more about human intelligence (or the lack thereof) than about machine intelligence. Like Obi-Wan said, "The ability to speak does not make one intelligent" haha.
-
That has always been the case. Even basic programs need debugging sometimes, so we developed debuggers.
No it hasn't. When you program, you break the problem down into many smaller sub-programs and then codify them. There are errors that need debugging, but never "how does this part of the program I wrote work?"
There are some cases, like detergents: apparently until recently we didn't know exactly how they work. But human-engineered tools are not comparable to this.
-
The other day I asked an LLM to create a partial number chart to help my son learn which numbers are next to each other. If I instructed it to do this using very detailed instructions, it failed miserably every time. And sometimes even when I told it to correct specific things about its answer, it still basically ignored me. The only way I could get it to do what I wanted consistently was to break the task down into small steps and tell it to show me its progress.
I'd be very interested to learn its "thought process" in each of those scenarios.
It's like that "Joey, repeat after me" meme from Friends haha
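For contrast, the task is trivial as ordinary code. A minimal sketch, assuming "partial number chart" means a 1-100 hundred chart with most cells blanked (the anchor numbers kept here are made up for illustration):

```python
# Print a hundred chart (ten numbers per row) with everything blanked
# except a handful of anchor numbers for the kid to work from.

def partial_chart(keep: set, width: int = 10, top: int = 100) -> str:
    rows = []
    for start in range(1, top + 1, width):
        row = [f"{n:>3}" if n in keep else "  _"
               for n in range(start, start + width)]
        rows.append(" ".join(row))
    return "\n".join(rows)

print(partial_chart(keep={1, 5, 10, 23, 47, 50, 68, 82, 99}))
```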
-
This is great stuff. If we can properly understand these “flows” of intelligence, we might be able to write optimized shortcuts for them, vastly improving performance.
Better yet, teach AI to write code replacing specific optimized AI networks. Then automatically profile, optimize, and unit test!
-
I use a calculator. Which an AI should also be and not need to do weird shit to do math.
Fascist. If someone does maths differently than your preference, it's not "weird shit". I'm facile with mental math despite what's perhaps a non-standard approach, and it's quite functional to be able to perform simple to moderate levels of mathematics mentally without relying on a calculator.
-
I use a calculator. Which an AI should also be and not need to do weird shit to do math.
Function calling is a thing chatbots can do now
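Something like the sketch below. `ask_model` is a hypothetical stub standing in for a real chat API (Anthropic, OpenAI, etc.), which would return a structured tool-call object rather than this toy dict:

```python
# Minimal sketch of a function-calling loop: the model emits a tool
# call, the host runs exact code, and the result goes back into the
# conversation.

def add(a: int, b: int) -> int:
    """Exact arithmetic the model delegates to instead of guessing."""
    return a + b

TOOLS = {"add": add}

def ask_model(prompt: str) -> dict:
    # Hypothetical stub: a real model would decide on its own that
    # this prompt calls for the `add` tool.
    return {"tool": "add", "args": {"a": 36, "b": 59}}

def run(prompt: str) -> str:
    reply = ask_model(prompt)
    if "tool" in reply:
        result = TOOLS[reply["tool"]](**reply["args"])
        return f"The answer is {result}."  # a real model would phrase this itself
    return reply.get("text", "")

print(run("What is 36 + 59?"))  # -> The answer is 95.
```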
-
Fascist. If someone does maths differently than your preference, it's not "weird shit". I'm facile with mental math despite what's perhaps a non-standard approach, and it's quite functional to be able to perform simple to moderate levels of mathematics mentally without relying on a calculator.
Wtf hahahahaha
-
when a calculator from the 80s can do the same thing.
1970s! The little blighters are even older than most people think.
Which is why I find it extra hilarious / extra infuriating that we've gone through all of these contortions and huge wastes of computing power and electricity to ultimately just make a computer worse at math.
Math is the one thing that computers are inherently good at. It's what they're for. Trying to use LLMs to perform it half-assedly is a completely braindead endeavor.
But who is going around asking these bots to specifically do math? Like, in normal usage I've never once done that, because I could just use a calculator, or spreadsheet software if I need to get fancy lol
-
This post did not contain any content.
Someone put 69 into the research and then into the article. Nice trolling.
-
How I'd do it is basically
72 * (10+3)
(72 * 10) + (72 * 3)
(720) + (3*(70+2))
(720) + (210+6)
(720) + (216)
936
Basically I break the numbers apart into easier chunks and then add them together.
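In code, that chunking is just the distributive law; a tiny sketch, assuming the problem was 72 * 13:

```python
# Split the second factor into tens and ones, then distribute.
a, b = 72, 13
tens, ones = divmod(b, 10)        # 13 -> (1, 3)
total = a * tens * 10 + a * ones  # (72*10) + (72*3) = 720 + 216
print(total)                      # 936
```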
-
Fascist. If someone does maths differently than your preference, it's not "weird shit". I'm facile with mental math despite what's perhaps a non-standard approach, and it's quite functional to be able to perform simple to moderate levels of mathematics mentally without relying on a calculator.
I am talking about the AI. It's already a computer. It shouldn't need to do anything other than calculate the equations. It doesn't have a brain, it doesn't think like a human, so it shouldn't need any special tools or ways to help it do math. It is a calculator, after all.
-
I wouldn't even attempt that in my head.
I can't keep track of things and then recall them later for the final result.
-
Anything that claims it "thinks" in any way I immediately dismiss as an advertisement of some sort. These models are doing very interesting things, but they are in no way "thinking" as a sentient mind does.
Anybody who claims they don't "think", before we've even completely figured out how they work or even how human thought works, is just spreading anti-AI sentiment beyond what is logical.
If you want to prove your position on this, you should set a better example than the AI by arguing only from facts rather than from things you hallucinate.
-
This reminds me of learning a shortcut in math class while knowing the lesson didn't cover that particular method. So I'd use the shortcut to get the answer on a multiple-choice question, but use the method from the lesson when asked to show my work (e.g. Pascal's Triangle vs. binomial expansion).
It might not seem like a shortcut for us, but something about this LLM's training makes it easier to use heuristics. That's actually a pretty big deal for a machine to choose fuzzy logic over algorithms when it knows that the teacher wants it to use the algorithm.
You're anthropomorphising quite a bit there. It is not trying to be deceptive; it's building two mostly unrelated pieces of text, deciding in one case that the fuzzy logic produces the most likely valid response, and in the other that the description of the algorithm is the most likely response. As far as I can tell, there's neither a reward for lying about the process nor any awareness of what the process actually was anywhere in this.
Still interesting (but unsurprising) that it's not getting there by doing actual maths, though.
-
"Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95," the MIT article explains."
That is precisely how I do math. Feel a little targeted that they called this odd.
I think it's odd in the sense that it's supposed to be software, so it should already know what 36 plus 59 is in a picosecond instead of doing mental arithmetic like we do.
At least that's my takeaway
-
Which is exactly how we do it.
We also check whether the word that popped into our heads actually rhymes by saying it out loud. Having actual validation steps we can take is a bigger difference than just being a little more robust.
We also have non-list-based methods, like breaking the word down into smaller chunks to try to build up hopefully more novel rhymes. I imagine professionals have even more tools, given the complexity of modern rhyme schemes.
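A crude spelling-based sketch of that chunking idea (a real rhyme engine would compare phonemes, not letters, and this word list is made up for illustration):

```python
# Rank candidate words by how long an ending they share with the
# target word; a longer shared chunk is a closer (spelling-wise) rhyme.

def shared_suffix_len(a: str, b: str) -> int:
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

WORDS = ["habit", "rabbit", "orbit", "summit", "gambit", "grab it"]

def rhymes(word: str, min_overlap: int = 3) -> list:
    scored = [(shared_suffix_len(word, w), w) for w in WORDS if w != word]
    return [w for score, w in sorted(scored, reverse=True)
            if score >= min_overlap]

print(rhymes("rabbit"))  # candidates sharing at least the "bit" ending
```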
-
I think what's wild about it is that it really is surprisingly similar to how we actually think. It's very different from how a computer (calculator) would calculate it.
So it's not a strange method for humans but that's what makes it so fascinating, no?
Yes, agreed. And calculators are essentially tabulators, and operate almost just like a skilled person using an abacus.
We shouldn't really be surprised because we designed these machines and programs based on our own human experiences and prior solutions to problems. It's still neat though.