Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought

Technology · 163 Posts · 97 Posters · 658 Views
  • R [email protected]

    I think what's wild about it is that it really is surprisingly similar to how we actually think. It's very different from how a computer (calculator) would calculate it.

So it's not a strange method for humans, but that's what makes it so fascinating, no?

[email protected] replied (#94):

    I mean neural networks are modeled after biological neurons/brains after all. Kind of makes sense...

    • J [email protected]

I think it's odd in the sense that it's supposed to be software, so it should already know what 36 plus 59 is in a picosecond, instead of doing mental arithmetic like we do.

      At least that's my takeaway

[email protected] replied (#95):

This is what the ARC-AGI test by Chollet has also shown regarding current AI / LLMs. They have a tendency to approach problems with this trial-and-error method and can be extremely inefficient (in their current form) with anything involving abstract / deductive reasoning.

Most LLMs do terribly at the test; the most recent breakthrough came with the reasoning models. But even the reasoning models struggle.

      ARC-AGI is simple, but it demands a keen sense of perception and, in some sense, judgment. It consists of a series of incomplete grids that the test-taker must color in based on the rules they deduce from a few examples; one might, for instance, see a sequence of images and observe that a blue tile is always surrounded by orange tiles, then complete the next picture accordingly. It’s not so different from paint by numbers.

      The test has long seemed intractable to major AI companies. GPT-4, which OpenAI boasted in 2023 had “advanced reasoning capabilities,” didn’t do much better than the zero percent earned by its predecessor. A year later, GPT-4o, which the start-up marketed as displaying “text, reasoning, and coding intelligence,” achieved only 5 percent. Gemini 1.5 and Claude 3.7, flagship models from Google and Anthropic, achieved 5 and 14 percent, respectively.
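To make the grid format concrete, here is a toy Python sketch of an ARC-style task. The rule and the number encoding are made up for illustration; real ARC-AGI puzzles are more varied than this.

    # Toy ARC-style task (made up, not a real ARC-AGI puzzle).
    # Rule to be deduced from examples: every blue cell (1) must be
    # surrounded by orange cells (2); empty cells are 0.
    def apply_rule(grid):
        rows, cols = len(grid), len(grid[0])
        out = [row[:] for row in grid]
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1:  # found a blue cell
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, cc = r + dr, c + dc
                            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols \
                                    and out[rr][cc] == 0:
                                out[rr][cc] = 2  # paint the neighbour orange
        return out

    incomplete = [[0, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0]]
    for row in apply_rule(incomplete):
        print(row)  # [2,2,2,0] / [2,1,2,0] / [2,2,2,0]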

• [email protected] wrote:
This post did not contain any content.
[email protected] replied (#96):

'is weirder than you thought'

I am about as likely to click a link with that line as I am one with 'this one weird trick' or 'side hustle'.

I would really like it if headlines treated us like adults and got rid of clickbaity lines.

        • N [email protected]

          Another very surprising outcome of the research is the discovery that these LLMs do not, as is widely assumed, operate by merely predicting the next word. By tracing how Claude generated rhyming couplets, Anthropic found that it chose the rhyming word at the end of verses first, then filled in the rest of the line.

If the LLM already knows the full sentence it's going to output from the first word it "guesses", I wonder if you could short-circuit it and just have it give the full sentence instead of doing a cycle for each word; that could maybe cut down on LLM energy costs.

[email protected] replied (#97):

I don't think it knows the full sentence; it just doesn't search for the words in the order they will appear in the sentence. It finds the end words first to make the poem rhyme, then looks for the rest of the words. I do it this way as well, just like many other people trying to create any kind of rhyming text.
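As a toy sketch of that rhyme-first ordering (the word lists and line templates here are made up for illustration):

    import random

    # Rhyme-first generation: pick the line-ending words before
    # filling in the rest of each line, instead of going left to right.
    RHYMES = {"oof": ["roof", "proof", "aloof"],
              "ight": ["night", "light", "sight"]}
    TEMPLATES = ["I wandered out onto the {}",
                 "She kept her distance, cool and {}"]

    def couplet(sound):
        end_a, end_b = random.sample(RHYMES[sound], 2)  # end words first
        return TEMPLATES[0].format(end_a), TEMPLATES[1].format(end_b)

    print(*couplet("oof"), sep="\n")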

          • N [email protected]

            Another very surprising outcome of the research is the discovery that these LLMs do not, as is widely assumed, operate by merely predicting the next word. By tracing how Claude generated rhyming couplets, Anthropic found that it chose the rhyming word at the end of verses first, then filled in the rest of the line.

If the LLM already knows the full sentence it's going to output from the first word it "guesses", I wonder if you could short-circuit it and just have it give the full sentence instead of doing a cycle for each word; that could maybe cut down on LLM energy costs.

[email protected] replied (#98):

Interestingly, this is also a technique for when you're improvising songs; it's called Target Rhyming.

The most effective way is to do A / B^1 / C / B^2 rhymes. You pick the B^2 rhyme, let's say "ibuprofen", and you use all of A and B^1 to think of a rhyme:

Oh it's Christmas time
            And I was up on my roof when
            I heard a jolly old voice
            Ask me for ibuprofen

            And the audience thinks you're fucking incredible for complex rhymes.

• [email protected] wrote:
This post did not contain any content.
[email protected] replied (#99):

"you can't trust its explanations as to what it has just done."

              I might have had a lucky guess, but this was basically my assumption. You can't ask LLMs how they work and get an answer coming from an internal understanding of themselves, because they have no 'internal' experience.

              Unless you make a scanner like the one in the study, non-verbal processing is as much of a black box to their 'output voice' as it is to us.

              • T [email protected]

'is weirder than you thought'

I am about as likely to click a link with that line as I am one with 'this one weird trick' or 'side hustle'.

I would really like it if headlines treated us like adults and got rid of clickbaity lines.

[email protected] replied (#100):

                They do it because it works on the whole. If straight titles were as effective they'd be used instead.

• [email protected] (#101):

                  But you wouldn't multiply, say, 74*14 to get the answer.

• [email protected] wrote:
This post did not contain any content.
[email protected] replied (#102):

                    Don't tell me that my thoughts aren't weird enough.

                    • B [email protected]

                      They do it because it works on the whole. If straight titles were as effective they'd be used instead.

[email protected] replied (#103):

                      The one weird trick that makes clickbait work

                      • T [email protected]

'is weirder than you thought'

I am about as likely to click a link with that line as I am one with 'this one weird trick' or 'side hustle'.

I would really like it if headlines treated us like adults and got rid of clickbaity lines.

[email protected] replied (#104):

But then you wouldn't need to click on their ad-infested shite website, where 1-2 paragraphs' worth of actual information is stretched into a giant essay so that they can show you more ads the longer you scroll.

                        • B [email protected]

                          They do it because it works on the whole. If straight titles were as effective they'd be used instead.

[email protected] replied (#105):

It really is quite unfortunate; I wish titles did what titles are supposed to do instead of being bait. But you are right: even when consciously trying to avoid clicking, sometimes curiosity gets the best of me. But I am improving.

                          • G [email protected]

                            This is pretty normal, in my opinion. Every time people complain about common core arithmetic there are dozens of us who come out of the woodwork to argue that the concepts being taught are important for deeper understanding of math, beyond just rote memorization of pencil and paper algorithms.

[email protected] replied (#106):

                            The problem with common core math isn’t that rounding is inherently bad, it’s that you don’t start with that as a framework.

                            • K [email protected]

                              But you wouldn't multiply, say, 74*14 to get the answer.

[email protected] replied (#107):

I might. Then I can subtract 74 to get 74*13, and subtract 26 to get 72*13.

I don't generally do that with 'weird' numbers; I usually move closer to multiples of 5, 9, 10, or 11.

But a computer stores information differently. Perhaps it moves closer to numbers with simpler binary representations.
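A quick Python sanity check of that compensation strategy, using the numbers above:

    # Start from an easier nearby product, then subtract the overcounts.
    start = 74 * 14          # the "easy" product: 1036
    step1 = start - 74       # one fewer 74: 74*13 = 962
    step2 = step1 - 2 * 13   # two fewer 13s: 72*13 = 936
    assert step1 == 74 * 13 and step2 == 72 * 13
    print(step2)  # 936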

• [email protected] wrote:

                                How I'd do it is basically

                                72 * (10+3)

                                (72 * 10) + (72 * 3)

                                (720) + (3*(70+2))

                                (720) + (210+6)

                                (720) + (216)

                                936

                                Basically I break the numbers apart into easier chunks and then add them together.

[email protected] replied (#108):

This is what I do, except I would add 700 and 236 at the end.

Well, except I would probably add 700 and 116 or something, because my working memory fucking sucks and my brain drops digits very easily when there's more than one.
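The quoted break-apart method, written out as a small Python sketch (the function name is mine, for illustration):

    # Split the second factor into tens and ones, distribute, then
    # split the first factor again for the remaining chunk.
    def chunked_multiply(a, b):
        b_tens, b_ones = divmod(b, 10)                 # 13 -> (1, 3)
        partial = a * b_tens * 10                      # 72*10 = 720
        a_tens, a_ones = divmod(a, 10)                 # 72 -> (7, 2)
        rest = b_ones * a_tens * 10 + b_ones * a_ones  # 3*70 + 3*2 = 216
        return partial + rest                          # 720 + 216 = 936

    assert chunked_multiply(72, 13) == 936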

• [email protected] wrote:

You're anthropomorphising quite a bit there. It is not trying to be deceptive; it's building two mostly unrelated pieces of text, once deciding that the fuzzy logic gets it the most likely valid response, and once that the description of the algorithm is the most likely response to the other question. As far as I can tell, there's neither a reward for lying about the process nor any awareness of what the process was anywhere in this.

                                  Still interesting (but unsurprising) that it's not getting there by doing actual maths, though.

[email protected] replied (#109):

                                  Maybe you're right. Maybe it's Markov chains all the way down.

                                  The only way I can think to test this would be to "poison" the training data with faulty arithmetic to see if it is just recalling precedent or actually implementing an algorithm.
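A minimal sketch of how one might generate that poisoned data (a hypothetical setup; the rate and offset are made up, and no actual training happens here):

    import random

    # Arithmetic training pairs where a fixed fraction of sums is
    # systematically wrong. A model fine-tuned on these that reproduces
    # the wrong answers is recalling precedent; one that still sums
    # correctly is implementing something more like an algorithm.
    def poisoned_pairs(n, poison_rate=0.1, offset=7):
        pairs = []
        for _ in range(n):
            a, b = random.randint(10, 99), random.randint(10, 99)
            answer = a + b
            if random.random() < poison_rate:
                answer += offset  # the deliberately faulty arithmetic
            pairs.append((f"{a} + {b} =", str(answer)))
        return pairs

    for prompt, target in poisoned_pairs(5):
        print(prompt, target)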

                                  • I [email protected]

                                    "Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95," the MIT article explains."

                                    That is precisrly how I do math. Feel a little targeted that they called this odd.

[email protected] replied (#110):

But you're doing two calculations now, an approximate one and another on the last digits. Since you're going to do the approximate calculation anyway, you might as well just do the accurate calculation and be done in one step.

                                    This solution, while it works, has the feeling of evolution. No intelligent design, which I suppose makes sense considering the AI did essentially evolve.
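For what it's worth, here is a toy Python reconstruction of the two-path process the quoted passage describes. This is an assumption-laden sketch (the noisy estimate simulates the "92ish" path), not Anthropic's actual circuitry:

    import random

    def two_path_add(a, b):
        rough = a + b + random.randint(-3, 3)  # path 1: fuzzy magnitude, "92ish"
        last = (a + b) % 10                    # path 2: 6 + 9 must end in 5
        # combine: the value near the rough estimate that ends in `last`
        return min((rough + d for d in range(-5, 5)),
                   key=lambda x: (x % 10 != last, abs(x - rough)))

    assert two_path_add(36, 59) == 95  # "putting that together ... gives 95"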

                                    • E [email protected]

But you're doing two calculations now, an approximate one and another on the last digits. Since you're going to do the approximate calculation anyway, you might as well just do the accurate calculation and be done in one step.

                                      This solution, while it works, has the feeling of evolution. No intelligent design, which I suppose makes sense considering the AI did essentially evolve.

[email protected] replied (#111):

                                      Appreciate the advice on how my brain should work.

                                      • K [email protected]

                                        But you wouldn't multiply, say, 74*14 to get the answer.

[email protected] replied (#112):

No, but I'd do 75*10 + 75*4, then subtract the extra.

The LLM method of doing it with multiple numbers without proper interpolation, though, makes it extra weird.

                                        • V [email protected]

                                          It really doesn't. You're just describing the "fancy" part of "fancy autocomplete." No one was ever really suggesting that they only predict the next word. If that was the case they would just be autocomplete, nothing fancy about it.

                                          What's being conveyed by "fancy autocomplete" is that these models ultimately operate by combining the most statistically likely elements of their dataset, with some application of random noise. More noise creates more "creative" (meaning more random, less probable) outputs. They do not actually "think" as we understand thought. This can clearly be seen in the examples given in the article, especially to do with math. The model is throwing together elements that are statistically proximate to the prompt. It's not actually applying a structured, logical method the way humans can be taught to.

[email protected] replied (#113):

People are generally shit at understanding probabilities, and even when they have a fairly strong math background they tend to explain probabilistic outcomes through anthropomorphism rather than doing the more difficult and "think-painy" statistical analysis that would be required to know if there was anything more to it.

I myself start to have thoughts that Balatro is purposefully screwing me over or feeding me outcomes, when it's just randomness and probability as stated.

                                          Ultimately, it's easier (and more fun) for us to reason that way and it largely serves us better in everyday life.

                                          But these things are entire casinos' worth of probability and statistics in and of themselves, and the people developing them want desperately to believe that they are something more than pseudorandom probabilistic fancy autocomplete engines.

                                          Add the difficulty of getting someone to understand how something works when their salary depends on them not understanding it to the existing inability of humans to reason probabilistically and the AGI from LLM delusion becomes near impossible to shake for some folks.

                                          I wouldn't be surprised if this AI hype bubble yields a cult in the end.
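As an aside, the "application of random noise" in the quoted post is essentially temperature sampling. A minimal sketch with toy logits (not any real model's numbers):

    import math, random

    # Higher temperature flattens the next-token distribution, making
    # less probable ("more creative") continuations more likely.
    def sample_next(logits, temperature=1.0):
        scaled = [v / temperature for v in logits.values()]
        m = max(scaled)                               # for numerical stability
        weights = [math.exp(s - m) for s in scaled]   # softmax numerators
        return random.choices(list(logits), weights=weights)[0]

    logits = {"the": 4.0, "a": 2.5, "banana": 0.1}
    print(sample_next(logits, temperature=0.2))  # almost always "the"
    print(sample_next(logits, temperature=2.0))  # "banana" shows up more often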
