Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought

[email protected]

Better yet, teach AI to write code replacing specific optimized AI networks. Then automatically profile and optimize and unit test!

[email protected]

Fascist. If someone does maths differently than your preference, it's not "weird shit". I'm facile with mental math despite what's perhaps a non-standard approach, and it's quite functional to be able to perform simple to moderate levels of mathematics mentally without relying on a calculator.

[email protected]

Function calling is a thing chatbots can do now
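
For anyone who hasn't seen it, here's roughly what that looks like. This is a minimal sketch, not any vendor's actual API: the `TOOLS` table, the `run_tool_call` helper and the JSON shape are all made up for illustration. The idea is just that the model emits a structured request and ordinary code does the arithmetic.

```python
import json

# Hypothetical tool registry: plain Python functions the host program exposes.
TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

def run_tool_call(message: str):
    """Parse a JSON tool request (assumed shape) and run the named function."""
    call = json.loads(message)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])

# Assumed example of what a model might emit instead of guessing the digits.
model_output = '{"name": "add", "arguments": {"a": 36, "b": 59}}'
print(run_tool_call(model_output))  # 95
```

The model only has to pick the right tool and arguments; the exact answer comes from the host code.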

[email protected]

Wtf hahahahaha

[email protected]

But who is going around asking these bots to specifically do math? Like, in normal usage I've never once done that, because I could just use a calculator or spreadsheet software if I need to get fancy lol

[email protected]

Someone put 69 to research and then to article. Nice trolling.

[email protected]

How I'd do it is basically

72 * (10+3)

(72 * 10) + (72 * 3)

(720) + (3*(70+2))

(720) + (210+6)

(720) + (216)

936

Basically I break the numbers apart into easier chunks and then add them together.
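
The same chunking written out as a small Python sketch, in case it helps. The function name `split_multiply` is just something I made up for illustration, and it assumes whole numbers split into tens and ones.

```python
def split_multiply(a: int, b: int) -> int:
    """Multiply a by b by breaking both numbers into tens and ones."""
    b_tens, b_ones = divmod(b, 10)  # 13 -> 1 ten, 3 ones
    a_tens, a_ones = divmod(a, 10)  # 72 -> 7 tens, 2 ones

    tens_part = a * b_tens * 10     # 72 * 10 = 720
    # 3 * (70 + 2) = 210 + 6 = 216
    ones_part = b_ones * a_tens * 10 + b_ones * a_ones

    return tens_part + ones_part    # 720 + 216


print(split_multiply(72, 13))  # 936
```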

[email protected]

OK, but the LLM is evidently shit at math, so its "non-standard" approach should still be adjusted

[email protected]

I am talking about the AI. It's already a computer. It shouldn't need to do anything other than calculate the equations. It doesn't have a brain, it doesn't think like a human, so it shouldn't need any special tools or ways to help it do math. It is a calculator, after all.

[email protected]

I wouldn't even attempt that in my head.
I can't keep track of things and then recall them later for the final result.

[email protected]

Anybody who claims they don't "think" before we've even completely figured out how they work, or how human thought works, is just spreading anti-AI sentiment beyond what is logical.

You should become a better example than an AI by only arguing based on facts rather than things you hallucinate if you want to prove your own position on this matter.

[email protected]

You're anthropomorphising quite a bit there. It is not trying to be deceptive; it's building two mostly unrelated pieces of text. For the one, the fuzzy logic gets it the most likely valid response; for the other, the description of the algorithm is the most likely response. As far as I can tell there's neither a reward for lying about the process nor any awareness of what the process was anywhere in this.

Still interesting (but unsurprising) that it's not getting there by doing actual maths, though.

[email protected]

I think it's odd in the sense that it's supposed to be software, so it should already know what 36 plus 59 is in a picosecond, instead of doing mental arithmetic like we do

At least that's my takeaway

[email protected]

We also check to see if the word that popped into our heads actually rhymes by saying it out loud. The actual validation steps we can take are a bigger difference than being a little more robust.

We also have non-list based methods like breaking the word down into smaller chunks to try to build up hopefully more novel rhymes. I imagine professionals have even more tools, given the complexity of more modern rhyme schemes.

[email protected]

Yes, agreed. And calculators are essentially tabulators, and operate almost just like a skilled person using an abacus.

We shouldn't really be surprised because we designed these machines and programs based on our own human experiences and prior solutions to problems. It's still neat though.

? Offline

…Duh.

[email protected]

My favourite part of the day: commenting LLMentalist under AI articles.

[email protected]

It also doesn't help that the AI companies deliberately use language to make their models seem more human-like and cogent. Saying that the model e.g. "thinks" in "conceptual spaces" is misleading imo. It abuses our innate tendency to anthropomorphize, which I guess is very fitting for a company with that name.

On this point I can highly recommend this open access and even language-wise accessible article: https://link.springer.com/article/10.1007/s10676-024-09775-5 (the authors also appear on an episode of the Better Offline podcast)

[email protected]

Pen and paper maths I'm pretty decent at, but ask me to calculate anything in my head and it's anyone's guess if I remembered to carry the 1 or not. Ever since learning about aphantasia I'm wondering if the lack of being able to visually store values has something to do with it.

[email protected]

Times 5 and times 10 tables are really easy for me. So yeah, in my mind it's an easier computation.

That being said, having a result of a little over 1,000 gives me an estimate for the magnitude of a number: it's around a thousand. It might be more or less, but it's not far from there.

agnos.is Forums
