agnos.is Forums

Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought

163 Posts 97 Posters 658 Views
  • L [email protected]

    But how is this different from your average redditor?

    [email protected]
    #92

    Redditor as "a person active on Reddit"? I don't see where I was talking about humans. Or am I misunderstanding the question?

    • H [email protected]

      But here’s the really funky bit. If you ask Claude how it got the correct answer of 95, it will apparently tell you, “I added the ones (6+9=15), carried the 1, then added the 10s (3+5+1=9), resulting in 95.” But that actually only reflects common answers in its training data as to how the sum might be completed, as opposed to what it actually did.

      This is not surprising. LLMs are not designed to have any introspection capabilities.

      Introspection could probably be tacked onto existing architectures in a few different ways, but as far as I know nobody's done it yet. It will be interesting to see how that might change LLM behavior.

      [email protected]
      #93

      I'm surprised that they are surprised by this as well. What did they expect, and why? How much of this is written to imply LLMs - their business - are more advanced/capable than they actually are?
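For reference, the grade-school procedure Claude claims to use in the quote above (add the ones, carry, add the tens) is trivial to write down literally; a minimal sketch:

```python
# The carrying procedure from the quoted explanation, done step by step.
def carry_add(a, b):
    ones = a % 10 + b % 10                 # 6 + 9 = 15
    carry, ones_digit = divmod(ones, 10)   # carry the 1, keep the 5
    tens = a // 10 + b // 10 + carry       # 3 + 5 + 1 = 9
    return tens * 10 + ones_digit

print(carry_add(36, 59))  # 95
```

The point of the article is that Claude describes this procedure while internally doing something quite different.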

      • R [email protected]

        I think what's wild about it is that it really is surprisingly similar to how we actually think. It's very different from how a computer (calculator) would calculate it.

        So it's not a strange method for humans but that's what makes it so fascinating, no?

        [email protected]
        #94

        I mean neural networks are modeled after biological neurons/brains after all. Kind of makes sense...

        • J [email protected]

          I think it's odd in the sense that it's supposed to be software, so it should already know what 36 plus 59 is in a picosecond, instead of doing mental arithmetic like we do

          At least that's my takeaway

          [email protected]
          #95

          This is what the ARC-AGI test by Chollet has also shown regarding current AI / LLMs. They have a tendency to approach problems with this trial-and-error method and can be extremely inefficient (in their current form) with anything involving abstract / deductive reasoning.

          Most LLMs do terribly at the test, with the most recent breakthrough coming from reasoning models. But even the reasoning models struggle.

          ARC-AGI is simple, but it demands a keen sense of perception and, in some sense, judgment. It consists of a series of incomplete grids that the test-taker must color in based on the rules they deduce from a few examples; one might, for instance, see a sequence of images and observe that a blue tile is always surrounded by orange tiles, then complete the next picture accordingly. It’s not so different from paint by numbers.

          The test has long seemed intractable to major AI companies. GPT-4, which OpenAI boasted in 2023 had “advanced reasoning capabilities,” didn’t do much better than the zero percent earned by its predecessor. A year later, GPT-4o, which the start-up marketed as displaying “text, reasoning, and coding intelligence,” achieved only 5 percent. Gemini 1.5 and Claude 3.7, flagship models from Google and Anthropic, achieved 5 and 14 percent, respectively.
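The "deduce the rule from a few examples, then complete the grid" format described above can be caricatured in a few lines. This is a toy sketch with an invented rule and tiny grids; real ARC tasks are far more varied and harder.

```python
# A toy ARC-style task: infer a cell-recoloring rule from one example
# pair, then apply it to a fresh grid.
example_in  = [[1, 0], [0, 1]]
example_out = [[2, 0], [0, 2]]   # the hidden rule: every 1 becomes a 2

def infer_color_map(grid_in, grid_out):
    mapping = {}
    for row_in, row_out in zip(grid_in, grid_out):
        for a, b in zip(row_in, row_out):
            mapping[a] = b
    return mapping

def apply_rule(grid, mapping):
    return [[mapping.get(cell, cell) for cell in row] for row in grid]

rule = infer_color_map(example_in, example_out)
print(apply_rule([[1, 1], [0, 1]], rule))  # [[2, 2], [0, 2]]
```

What makes ARC hard for LLMs is that the rule changes per puzzle, so memorized precedent doesn't help.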

          • cm0002@lemmy.worldC [email protected]
            This post did not contain any content.
            [email protected]
            #96

            'is weirder than you thought '

            I'm about as likely to click a link with that line in it as one with

            'this one weird trick' or 'side hustle'.

            I would really like it if headlines treated us like adults and got rid of clickbaity lines.

            • N [email protected]

              Another very surprising outcome of the research is the discovery that these LLMs do not, as is widely assumed, operate by merely predicting the next word. By tracing how Claude generated rhyming couplets, Anthropic found that it chose the rhyming word at the end of verses first, then filled in the rest of the line.

              If the LLM already knows the full sentence it's going to output from the first word it "guesses", I wonder if you could short-circuit it and just have it give the full sentence instead of doing a cycle for each word; that could maybe cut down on LLM energy costs.

              [email protected]
              #97

              I don't think it knows the full sentence; it just doesn't search for the words in the order they will appear in the sentence. It finds the end-words first to make the poem rhyme, then looks for the rest of the words. I do it this way as well, just like many other people trying to create any kind of rhyming text.
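The "end-words first" idea can be sketched in code. This is a toy illustration only: the rhyme table and line stems are invented, and this is not Claude's actual decoding procedure.

```python
# Choose the rhyming end words first, then fill each line in around them.
RHYMES = {"time": ["rhyme", "climb", "chime"]}  # invented toy rhyme table

def plan_couplet(end_word, stem_a, stem_b):
    rhyme = RHYMES[end_word][0]        # end words decided first...
    line_a = f"{stem_a} {end_word}"    # ...then the rest of each line
    line_b = f"{stem_b} {rhyme}"       # is filled in around them
    return line_a, line_b

a, b = plan_couplet("time", "I wrote this in my spare", "and somehow made it")
print(a)  # I wrote this in my spare time
print(b)  # and somehow made it rhyme
```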

              • N [email protected]

                Another very surprising outcome of the research is the discovery that these LLMs do not, as is widely assumed, operate by merely predicting the next word. By tracing how Claude generated rhyming couplets, Anthropic found that it chose the rhyming word at the end of verses first, then filled in the rest of the line.

                If the LLM already knows the full sentence it's going to output from the first word it "guesses", I wonder if you could short-circuit it and just have it give the full sentence instead of doing a cycle for each word; that could maybe cut down on LLM energy costs.

                [email protected]
                #98

                Interestingly, this is also a technique used when improvising songs; it's called target rhyming.

                The most effective way is to do A / B^1 / C / B^2 rhymes. You pick the B^2 rhyme first, let's say "ibuprofen", and you have all of A and B^1 to think of a setup for the rhyme:

                Oh its Christmas time
                And I was up on my roof when
                I heard a jolly old voice
                Ask me for ibuprofen

                And the audience thinks you're fucking incredible for complex rhymes.

                • cm0002@lemmy.worldC [email protected]
                  This post did not contain any content.
                  [email protected]
                  #99

                  you can't trust its explanations as to what it has just done.

                  I might have had a lucky guess, but this was basically my assumption. You can't ask LLMs how they work and get an answer coming from an internal understanding of themselves, because they have no 'internal' experience.

                  Unless you make a scanner like the one in the study, non-verbal processing is as much of a black box to their 'output voice' as it is to us.

                  • T [email protected]

                    'is weirder than you thought '

                    I'm about as likely to click a link with that line in it as one with

                    'this one weird trick' or 'side hustle'.

                    I would really like it if headlines treated us like adults and got rid of clickbaity lines.

                    [email protected]
                    #100

                    They do it because it works on the whole. If straight titles were as effective they'd be used instead.

                    • K [email protected]
                      #101

                      But you wouldn't multiply, say, 74*14 to get the answer.

                      • cm0002@lemmy.worldC [email protected]
                        This post did not contain any content.
                        [email protected]
                        #102

                        Don't tell me that my thoughts aren't weird enough.

                        • B [email protected]

                          They do it because it works on the whole. If straight titles were as effective they'd be used instead.

                          [email protected]
                          #103

                          The one weird trick that makes clickbait work

                          • T [email protected]

                            'is weirder than you thought '

                            I'm about as likely to click a link with that line in it as one with

                            'this one weird trick' or 'side hustle'.

                            I would really like it if headlines treated us like adults and got rid of clickbaity lines.

                            [email protected]
                            #104

                            But then you wouldn't need to click on their ad-infested shite website, where 1-2 paragraphs' worth of actual information is stretched into a giant essay so they can show you more ads the longer you scroll.

                            • B [email protected]

                              They do it because it works on the whole. If straight titles were as effective they'd be used instead.

                              [email protected]
                              #105

                              It really is quite unfortunate. I wish titles did what titles are supposed to do instead of being bait. But you are right: even when consciously trying to avoid clicking, sometimes curiosity gets the best of me. I am improving, though.

                              • G [email protected]

                                This is pretty normal, in my opinion. Every time people complain about common core arithmetic there are dozens of us who come out of the woodwork to argue that the concepts being taught are important for deeper understanding of math, beyond just rote memorization of pencil and paper algorithms.

                                [email protected]
                                #106

                                The problem with common core math isn’t that rounding is inherently bad, it’s that you don’t start with that as a framework.

                                • K [email protected]

                                  But you wouldn't multiply, say, 74*14 to get the answer.

                                  [email protected]
                                  #107

                                  I might. Then I can subtract 74 to get 74*13, and subtract 26 to get 72*13.

                                  I don't generally do that to 'weird' numbers; I usually get closer to multiples of 5, 9, 10, or 11.

                                  But a computer stores information differently. Perhaps it moves closer to numbers with simpler binary addresses.
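That adjust-from-a-nearby-product idea, using 72*13 (the example from elsewhere in the thread) as the target, looks like this written out:

```python
# Start from a nearby overshoot and peel the excess back off.
start = 74 * 14            # a nearby product that overshoots 72*13
step1 = start - 74         # dropping one 74 leaves 74*13
step2 = step1 - 2 * 13     # dropping two 13s leaves 72*13
print(step2)  # 936
```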

                                  • gormadt@lemmy.blahaj.zoneG [email protected]

                                    How I'd do it is basically

                                    72 * (10+3)

                                    (72 * 10) + (72 * 3)

                                    (720) + (3*(70+2))

                                    (720) + (210+6)

                                    (720) + (216)

                                    936

                                    Basically I break the numbers apart into easier chunks and then add them together.

                                    [email protected]
                                    #108

                                    This is what I do, except I would add 700 and 236 at the end.

                                    Well, except I would probably add 700 and 116 or something, because my working memory fucking sucks and my brain drops digits very easily when there's more than one.
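The distributive breakdown quoted above (split the smaller factor into tens and ones, multiply each part, then add) is a one-liner in code:

```python
# 72*13 = 72*(10+3) = 72*10 + 72*3 = 720 + 216 = 936
def multiply_by_parts(a, b):
    tens, ones = divmod(b, 10)       # 13 -> (1, 3)
    return a * tens * 10 + a * ones  # 720 + 216

print(multiply_by_parts(72, 13))  # 936
```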

                                    • mudman@fedia.ioM [email protected]

                                      You're anthropomorphising quite a bit there. It is not trying to be deceptive; it's building two mostly unrelated pieces of text, deciding once that the fuzzy logic is getting it the most likely valid response, and separately that the description of the algorithm is the most likely response to the other question. As far as I can tell, there's neither a reward for lying about the process nor any awareness of what the process was anywhere in this.

                                      Still interesting (but unsurprising) that it's not getting there by doing actual maths, though.

                                      [email protected]
                                      #109

                                      Maybe you're right. Maybe it's Markov chains all the way down.

                                      The only way I can think to test this would be to "poison" the training data with faulty arithmetic to see if it is just recalling precedent or actually implementing an algorithm.
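Generating the poisoned data for that experiment is straightforward; the sketch below is illustrative only (the offset and format are invented, and the training step itself is left out). If a model trained on this reproduces the wrong answers, it's recalling precedent; if it answers the poisoned pairs correctly, it learned an actual algorithm.

```python
import random

# Emit addition problems where a random fraction of pairs gets a
# systematically wrong answer (off by a fixed amount).
def make_dataset(n, poison_rate=0.1, seed=0):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a, b = rng.randint(0, 99), rng.randint(0, 99)
        poisoned = rng.random() < poison_rate
        answer = a + b + (7 if poisoned else 0)  # off by 7 when poisoned
        rows.append((f"{a}+{b}=", answer, poisoned))
    return rows

rows = make_dataset(200)
```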

                                      • I [email protected]

                                        "Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95," the MIT article explains.

                                        That is precisely how I do math. I feel a little targeted that they called this odd.

                                        [email protected]
                                        #110

                                        But you're doing two calculations now, an approximate one and another on the last digits. Since you're going to do the approximate calculation anyway, you might as well just do the accurate calculation and be done in one step.

                                        This solution, while it works, has the feeling of evolution. No intelligent design, which I suppose makes sense considering the AI did essentially evolve.
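The two-path process the article describes can be caricatured in code. This is a toy sketch only: the rough-estimate rule and the way the paths are combined below are invented for illustration, not Anthropic's findings.

```python
def two_path_add(a, b):
    # Path 1: a crude magnitude estimate (keep a, round b to the nearest 10)
    rough = a + round(b, -1)             # 36 + 60 = "about 96"
    # Path 2: the exact last digit, computed independently
    last = (a % 10 + b % 10) % 10        # (6 + 9) % 10 = 5
    # Combine: the number with that last digit closest to the rough estimate
    base = rough - rough % 10 + last
    return min((base - 10, base, base + 10), key=lambda c: abs(c - rough))

print(two_path_add(36, 59))  # 95
```

The "evolved, not designed" feel comes through: neither path alone gets the answer, but together they usually land on it.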

                                        • E [email protected]

                                          But you're doing two calculations now, an approximate one and another one on the last digits, since you're going to do the approximate calculation you might act as well just do the accurate calculation and be done in one step.

                                          This solution, while it works, has the feeling of evolution. No intelligent design, which I suppose makes sense considering the AI did essentially evolve.

                                           [email protected]
                                          #111

                                          Appreciate the advice on how my brain should work.
