agnos.is Forums
Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought

Technology · 163 Posts · 97 Posters · 658 Views
[email protected] (#41), replying to [email protected]:

> OK, I've been willing to just let the examples roll even though most people are just describing how they'd do the calculation, not a process of gradual approximation, which was supposed to be the point of the way the LLM does it...
>
> ...but this one got me.
>
> Seriously, you think 70x5 is easier to compute than 70x3? Not only is that a harder one for me to get to in the notoriously unfriendly 7 times table, but it's also further away from the correct answer and past the intuitive upper limit of 1000.

The 7 times table is unfriendly?

I love 7 timeses. If numbers were sentient, I think I could be friends with 7.
    • V [email protected]

      It really doesn't. You're just describing the "fancy" part of "fancy autocomplete." No one was ever really suggesting that they only predict the next word. If that was the case they would just be autocomplete, nothing fancy about it.

      What's being conveyed by "fancy autocomplete" is that these models ultimately operate by combining the most statistically likely elements of their dataset, with some application of random noise. More noise creates more "creative" (meaning more random, less probable) outputs. They do not actually "think" as we understand thought. This can clearly be seen in the examples given in the article, especially to do with math. The model is throwing together elements that are statistically proximate to the prompt. It's not actually applying a structured, logical method the way humans can be taught to.

      R This user is from outside of this forum
      R This user is from outside of this forum
      [email protected]
      wrote on last edited by
      #42

      Genuine question regarding the rhyme thing, it can be argued that "predicting backwards isn't very different" but you can't attribute generating the rhyme first to noise, right? So how does it "know" (for lack of a better word) to generate the rhyme first?

      D 1 Reply Last reply
      0
      • T [email protected]

        The 7 times table is unfriendly?

        I love 7 timeses. If numbers were sentient, I think I could be friends with 7.

        mudman@fedia.ioM This user is from outside of this forum
        mudman@fedia.ioM This user is from outside of this forum
        [email protected]
        wrote on last edited by
        #43

        I've always hated it and eight. I can only remember the ones that are familiar at a glance from the reverse table and to this day I sometimes just sum up and down from those "anchor" references. They're so weird and slippery.

        T 1 Reply Last reply
        0
[email protected] (#44), replying to [email protected]:

> I've always hated it, and eight. I can only remember the ones that are familiar at a glance from the reverse table, and to this day I sometimes just sum up and down from those "anchor" references. They're so weird and slippery.

Huh.

Going back to the "being friends" thing, I think you and I could be friends, since we both apply qualities to numbers; but it might be challenging, because I find 7 and 8 to be two of the best. They're quirky, but interesting.

Thank you for the insight.
          • R [email protected]

            Genuine question regarding the rhyme thing, it can be argued that "predicting backwards isn't very different" but you can't attribute generating the rhyme first to noise, right? So how does it "know" (for lack of a better word) to generate the rhyme first?

            D This user is from outside of this forum
            D This user is from outside of this forum
            [email protected]
            wrote on last edited by
            #45

            It already knows which words are, statistically, more commonly rhymed with each other. From the massive list of training poems. This is what the massive data sets are for. One of the interesting things is that it's not predicting backwards, exactly. It's actually mathematically converging on the response text to the prompt, all the words at the same time.
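
The first half of that claim is easy to picture in code: a toy tally of which line-ending words pair up in a couplet corpus, loosely sketching the kind of statistic a model could absorb. The corpus and pairing rule below are invented for illustration (the first couplet echoes the article's own "grab it / rabbit" example):

```python
# Toy tally of rhyme pairings: count which words end consecutive lines in a
# couplet corpus. A rough sketch of the statistic described above, not how a
# model stores it (that lives in learned weights, not a lookup table).
from collections import Counter

corpus = [
    ("He saw a carrot and had to grab it,", "his hunger was like a starving rabbit"),
    ("The moon hung low above the town,", "and slowly let its soft light down"),
]

def last_word(line: str) -> str:
    return line.rstrip(",.!?").split()[-1].lower()

rhyme_counts = Counter((last_word(a), last_word(b)) for a, b in corpus)
print(rhyme_counts.most_common())   # [(('it', 'rabbit'), 1), (('town', 'down'), 1)]
```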

[email protected] (#46):

For me personally, anything times 5 can be reached by halving the number, then multiplying by 10.

Example: 66 x 5 = Y

  • (66 / 2) x (5 x 2) = Y
    • cancel out the division by creating equal multiplication in the other factor
    • 66 / 2 = 33
    • 5 x 2 = 10
  • 33 x 10 = Y
  • Y = 330
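
For what it's worth, the trick checks out in general; a minimal Python sketch (the function name is mine):

```python
# Sketch of the "halve it, then times ten" trick for multiplying by 5,
# as described in the post above: n * 5 == (n / 2) * 10 exactly.

def times_five(n: int) -> float:
    return (n / 2) * 10

assert times_five(66) == 330                             # the post's example: 66 x 5
assert all(times_five(n) == n * 5 for n in range(1000))  # holds for odd n too
```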

              • M [email protected]

                Yeah but I think this is still the same, just not a single language. It might think in some mix of languages (which you can actuaysee sometimes if you push certain LLMs to their limit and they start producing mixed language responses.)

                But it still has limitations because of the structure in language. This is actually a thing that humans have as well, the limiting of abstract thought through internal monologue thinking

                W This user is from outside of this forum
                W This user is from outside of this forum
                [email protected]
                wrote on last edited by
                #47

                Probably, given that LLMs only exist in the domain of language, still interesting that they seem to have a "conceptual" systems that is commonly shared between languages.

                1 Reply Last reply
                0
[email protected] (#48):

See, for me, it's not that 7*5 is easier to compute than 7*3; it's that 5*7 is easier to compute than 7*3.

I saw your other comment about 8's, too, and I've always found those to be a pain, so I reverse them, if not outright convert them to arithmetic problems. 8x4 is some unknown value, but X*8 is always X*10 - 2X, although I do have most of the multiplication tables memorized for lower values.
8*7 is an unknown number that only the wisest sages can compute, however.
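
That rewrite is mechanical enough to verify with a throwaway sketch (again, the name is mine):

```python
# Sketch of the X*8 == X*10 - 2X rewrite from the post above.
def times_eight(x: int) -> int:
    return x * 10 - 2 * x

assert times_eight(7) == 56   # even 8*7, "which only the wisest sages can compute"
assert all(times_eight(x) == x * 8 for x in range(1000))
```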

                  • C [email protected]

                    Yeah I caught that too, I'd be curious to know more about what specifically they meant by that.

                    Being able to link all of the words that have a similar meaning, say, nearby, close, adjacent, proximal, side-by-side, etc and realize they all share something in common could be done in many ways. Some would require an abstract understanding of what spatial distance actually is, an understanding of physical reality. Others would not, one could simply make use of word adjacency, noticing that all of these words are frequently used alongside certain other words. This would not be abstract, it'd be more of a simple sum of clear correlations. You could call this mathematical framework a universal language if you wanted.

                    Ultimately, a person learns meaning and then applies language to it. When I'm a baby I see my mother, and know my mother is something that exists. Then I learn the word "mother" and apply it to her. The abstract comes first. Can an LLM do something similar despite having never seen anything that isn't a word or number?

                    W This user is from outside of this forum
                    W This user is from outside of this forum
                    [email protected]
                    wrote on last edited by
                    #49

                    I don't think that's really a fair comparison, babies exist with images and sounds for over a year before they begin to learn language, so it would make sense that they begin to understand the world in non-linguistic terms and then apply language to that. LLMs only exist in relation to language so couldnt understand a concept separately to language, it would be like asking a person to conceptualise radio waves prior to having heard about them.

                    C 1 Reply Last reply
                    0
                    • W [email protected]

                      I don't think that's really a fair comparison, babies exist with images and sounds for over a year before they begin to learn language, so it would make sense that they begin to understand the world in non-linguistic terms and then apply language to that. LLMs only exist in relation to language so couldnt understand a concept separately to language, it would be like asking a person to conceptualise radio waves prior to having heard about them.

                      C This user is from outside of this forum
                      C This user is from outside of this forum
                      [email protected]
                      wrote on last edited by
                      #50

                      Exactly. It's sort of like a massively scaled up example of the blind man and the elephant.

                      1 Reply Last reply
                      0
                      • V [email protected]

                        It really doesn't. You're just describing the "fancy" part of "fancy autocomplete." No one was ever really suggesting that they only predict the next word. If that was the case they would just be autocomplete, nothing fancy about it.

                        What's being conveyed by "fancy autocomplete" is that these models ultimately operate by combining the most statistically likely elements of their dataset, with some application of random noise. More noise creates more "creative" (meaning more random, less probable) outputs. They do not actually "think" as we understand thought. This can clearly be seen in the examples given in the article, especially to do with math. The model is throwing together elements that are statistically proximate to the prompt. It's not actually applying a structured, logical method the way humans can be taught to.

                        F This user is from outside of this forum
                        F This user is from outside of this forum
                        [email protected]
                        wrote on last edited by
                        #51

                        Unfortunately, these articles are often written by people who don't know enough to realize they're missing important nuances.

                        D 1 Reply Last reply
                        0
[email protected] (#52), replying to [email protected] (the original post contained no text):

"Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95," the MIT article explains.

That is precisely how I do math. I feel a little targeted that they called this odd.
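
That two-track process (a rough magnitude estimate plus an exact ones digit) is easy to caricature in code. A toy sketch of the flavor of it, emphatically not Anthropic's actual mechanism:

```python
# Toy caricature of the two parallel tracks the article describes for 36 + 59:
# one track gets the magnitude roughly right ("40ish + 60ish"), another gets
# the ones digit exactly right (6 + 9 ends in 5), and reconciling the two
# yields the exact answer.

def rough_track(a: int, b: int) -> int:
    return round(a, -1) + round(b, -1)      # 36 + 59 -> 40 + 60 = 100

def ones_track(a: int, b: int) -> int:
    return (a % 10 + b % 10) % 10           # 6 + 9 = 15 -> ends in 5

def fuzzy_add(a: int, b: int) -> int:
    approx, last = rough_track(a, b), ones_track(a, b)
    # snap the estimate to the nearest value with the right ones digit;
    # only sound when the rough estimate is within 5 of the true sum
    candidates = [approx + d for d in range(-9, 10) if (approx + d) % 10 == last]
    return min(candidates, key=lambda c: abs(c - approx))

assert fuzzy_add(36, 59) == 95
```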

[email protected] (#53):

I think a lot of services are doing this behind the scenes already. Otherwise ChatGPT would be getting basic arithmetic wrong a lot more often, considering the methods the article has shown it's using.
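
If services do that, the plumbing could be as simple as routing obvious arithmetic to deterministic code before the model ever sees it. A hypothetical sketch; the regex, the routing rule, and the call_llm stub are all invented for illustration:

```python
# Hypothetical sketch of offloading plain arithmetic to deterministic code
# before (or instead of) calling the model, as the post speculates services
# already do. Nothing here reflects any vendor's actual pipeline.
import re

def call_llm(prompt: str) -> str:
    return f"<model answer for {prompt!r}>"   # stand-in for a real model call

def answer(prompt: str) -> str:
    m = re.fullmatch(r"\s*(\d+)\s*([+*-])\s*(\d+)\s*", prompt)
    if m:                                     # pure arithmetic: compute exactly
        a, op, b = int(m[1]), m[2], int(m[3])
        return str({"+": a + b, "-": a - b, "*": a * b}[op])
    return call_llm(prompt)                   # everything else: ask the model

print(answer("36 + 59"))                    # -> "95", no model involved
print(answer("what rhymes with rabbit?"))   # -> routed to the model
```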

                            • G [email protected]

                              This is pretty normal, in my opinion. Every time people complain about common core arithmetic there are dozens of us who come out of the woodwork to argue that the concepts being taught are important for deeper understanding of math, beyond just rote memorization of pencil and paper algorithms.

                              F This user is from outside of this forum
                              F This user is from outside of this forum
                              [email protected]
                              wrote on last edited by
                              #54

                              Rote memorization should be minimized in school curriculum

                              F 1 Reply Last reply
                              0
[email protected] (#55), replying to [email protected] (the original post contained no text):

> Another very surprising outcome of the research is the discovery that these LLMs do not, as is widely assumed, operate by merely predicting the next word. By tracing how Claude generated rhyming couplets, Anthropic found that it chose the rhyming word at the end of verses first, then filled in the rest of the line.

If the LLM already knows the full sentence it's going to output from the first word it "guesses," I wonder if you could short-circuit it and just have it emit the full sentence instead of running a cycle for each word. That could maybe cut down on LLM energy costs.
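
The "pick the rhyme word first, fill in the line afterwards" idea can be played with outside an LLM. A toy sketch of that ordering of decisions; the vocabulary is invented and the output is word salad, since only the decision order matters here:

```python
# Toy demonstration of deciding the line's ENDING first and filling in the
# rest afterwards, the ordering the article describes for rhyming couplets.
# A real model does this with learned features, not lookup tables.
import random

RHYME_DICT = {"night": ["light", "bright", "flight"]}
FILLERS = ["the", "moon", "rose", "soft", "over", "silver", "hills"]

def plan_line(rhyme_with: str, length: int = 6) -> str:
    ending = random.choice(RHYME_DICT[rhyme_with])  # rhyme word chosen first
    body = random.choices(FILLERS, k=length - 1)    # rest filled in afterwards
    return " ".join(body + [ending])

# prints a nonsense line, but one that always ends on a rhyme for "night"
print(plan_line("night"))
```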

                                • M [email protected]

                                  72 * 10 + 70 * 3 + 2 * 3

                                  That's what I do in my head if I need an exact result. If I'm approximateing I'll probably just do something like 70 * 15 which is much easier to compute (70 * 10 + 70 * 5 = 700 + 350 = 1050).

                                  S This user is from outside of this forum
                                  S This user is from outside of this forum
                                  [email protected]
                                  wrote on last edited by
                                  #56

                                  (72 * 10) + (2 * 3) = x

                                  There, fixed, because otherwise order of operation gets fucky.

                                  M 1 Reply Last reply
                                  0
                                  • H [email protected]

                                    But here’s the really funky bit. If you ask Claude how it got the correct answer of 95, it will apparently tell you, “I added the ones (6+9=15), carried the 1, then added the 10s (3+5+1=9), resulting in 95.” But that actually only reflects common answers in its training data as to how the sum might be completed, as opposed to what it actually did.

                                    This is not surprising. LLMs are not designed to have any introspection capabilities.

                                    Introspection could probably be tacked onto existing architectures in a few different ways, but as far as I know nobody's done it yet. It will be interesting to see how that might change LLM behavior.

                                    S This user is from outside of this forum
                                    S This user is from outside of this forum
                                    [email protected]
                                    wrote on last edited by
                                    #57

                                    Then take that concept further, and let it keep introspecting and inspecting how it comes to the conclusions it does and eventually....

                                    1 Reply Last reply
                                    0
                                    • D [email protected]

                                      It already knows which words are, statistically, more commonly rhymed with each other. From the massive list of training poems. This is what the massive data sets are for. One of the interesting things is that it's not predicting backwards, exactly. It's actually mathematically converging on the response text to the prompt, all the words at the same time.

                                      semperverus@lemmy.worldS This user is from outside of this forum
                                      semperverus@lemmy.worldS This user is from outside of this forum
                                      [email protected]
                                      wrote on last edited by
                                      #58

                                      Which is exactly how we do it.

                                      thisisnothim@sopuli.xyzT 1 Reply Last reply
                                      0
[email protected] (#59), replying to [email protected]:

> Anything that claims it "thinks" in any way I immediately dismiss as an advertisement of some sort. These models are doing very interesting things, but it is in no way "thinking" as a sentient mind does.

You know they don't think, even though "it's a peculiar truth that we don't understand how large language models (LLMs) actually work"?

It's truly shocking to read this from a mess of connected neurons and synapses like yourself. You're simply doing fancy prediction of the next word /s
                                        • S [email protected]

                                          It doesn't, who the hell cares if someone allowed it to break "predict whole text" into "predict part by part, and then "with rhyme, we start at the end". Sounds like a naive (not as in "simplistic", but as "most straightforward") way to code this, so given the task to write an automatic poetry producer, I would start with something similar. The whole thing still stands as fancy auto-complete

                                          L This user is from outside of this forum
                                          L This user is from outside of this forum
                                          [email protected]
                                          wrote on last edited by
                                          #60

                                          But how is this different from your average redditor?

                                          S 1 Reply Last reply
                                          0