agnos.is Forums

AGI achieved 🤖

Lemmy Shitpost · 142 Posts · 69 Posters
• In reply to [email protected]:

    LLM wasn’t made for this

    There's a thought experiment that challenges the concept of cognition, called The Chinese Room. What it essentially postulates is a conversation between two people, one of whom is speaking Chinese and getting responses in Chinese. And the first speaker wonders "Does my conversation partner really understand what I'm saying or am I just getting elaborate stock answers from a big library of pre-defined replies?"

    The LLM is literally a Chinese Room. And one way we can know this is through these interactions. The machine isn't analyzing the fundamental meaning of what I'm saying, it is simply mapping the words I've input onto a big catalog of responses and giving me a standard output. In this case, the problem the machine is running into is a legacy meme about people miscounting the number of "r"s in the word Strawberry. So "2" is the stock response it knows via the meme reference, even though a much simpler and dumber machine that was designed to handle this basic input question could have come up with the answer faster and more accurately.

    When you hear people complain about how the LLM "wasn't made for this", what they're really complaining about is their own shitty methodology. They build a glorified card catalog. A device that can only take inputs, feed them through a massive library of responses, and sift out the highest probability answer without actually knowing what the inputs or outputs signify cognitively.

Even if you want to argue that having a natural language search engine is useful (damn, wish we had a tool that did exactly this back in August of 1996, amirite?), the implementation of the current iteration of these tools is dogshit because the developers did a dogshit job of sanitizing and rationalizing their library of data. That's also, incidentally, why DeepSeek was running laps around OpenAI and Gemini as of last year.

    Imagine asking a librarian "What was happening in Los Angeles in the Summer of 1989?" and that person fetching you back a stack of history textbooks, a stack of Sci-Fi screenplays, a stack of regional newspapers, and a stack of Iron-Man comic books all given equal weight? Imagine hearing the plot of the Terminator and Escape from LA intercut with local elections and the Loma Prieta earthquake.

    That's modern LLMs in a nutshell.

[email protected] wrote (#92):

    You've missed something about the Chinese Room. The solution to the Chinese Room riddle is that it is not the person in the room but rather the room itself that is communicating with you. The fact that there's a person there is irrelevant, and they could be replaced with a speaker or computer terminal.

    Put differently, it's not an indictment of LLMs that they are merely Chinese Rooms, but rather one should be impressed that the Chinese Room is so capable despite being a completely deterministic machine.

    If one day we discover that the human brain works on much simpler principles than we once thought, would that make humans any less valuable? It should be deeply troubling to us that LLMs can do so much while the mathematics behind them are so simple. Arguments that because LLMs are just scaled-up autocomplete they surely can't be very good at anything are not comforting to me at all.

    • J [email protected]

People who treat an LLM's trouble with these questions as evidence one way or another about how good or bad LLMs are just don't understand tokenization. This is not a symptom of some big-picture, deep problem with LLMs; it's a curious artifact, like compression artifacts in a JPEG image, and it doesn't really matter for the vast majority of applications.

      You may hate AI but that doesn't excuse being ignorant about how it works.

[email protected] wrote (#93):

These sorts of artifacts wouldn't be a huge issue except that AI is being pushed to the general public as an alternative means of learning basic information. The meme example is obvious to someone with a strong understanding of English, but learners and children might get an artifact and stamp it in their memory, working for years off bad information. A few false things every now and then aren't a problem; that's unavoidable in learning. Accumulate thousands over long-term use, however, and your understanding of the world gets coarser, like Swiss cheese with voids so large it can't hold itself up.

      • J [email protected]

People who treat an LLM's trouble with these questions as evidence one way or another about how good or bad LLMs are just don't understand tokenization. This is not a symptom of some big-picture, deep problem with LLMs; it's a curious artifact, like compression artifacts in a JPEG image, and it doesn't really matter for the vast majority of applications.

        You may hate AI but that doesn't excuse being ignorant about how it works.

[email protected] wrote (#94):

        And yet they can seemingly spell and count (small numbers) just fine.

• In reply to [email protected]:

          How do you pronounce "Mrs" so that there's an "r" sound in it?

[email protected] wrote (#95, last edited by [email protected]):

          "His property"

          Otherwise it's just Ms.

• In reply to [email protected]:

LLM wasn’t made for this ... That's modern LLMs in a nutshell.

[email protected] wrote (#96):

That's a very long answer to my snarky little comment 🙂 I appreciate it though. Personally, I find LLMs interesting and I've spent quite a while playing with them. But, after all, they are like you described: an interconnected catalogue of random stuff, with some hallucinations to fill the gaps. They are NOT a reliable source of information or general knowledge, or even safe to use as an "assistant". The marketing of LLMs as being fit for such purposes is the problem. Humans tend to turn off their brains and blindly trust technology, and the tech companies encourage them to do so by making false promises.

            • Q [email protected]

              I really like checking these myself to make sure it’s true. I WAS NOT DISAPPOINTED!

              (Total Rs is 8. But the LOGIC ChatGPT pulls out is ……. remarkable!)

[email protected] wrote (#97):

              "Let me know if you'd like help counting letters in any other fun words!"

Oh well, these newish calls for engagement sure reach ridiculous extremes sometimes.

• In reply to [email protected]:

                How do you pronounce "Mrs" so that there's an "r" sound in it?

[email protected] wrote (#98):

                I don’t, but it’s abbreviated with one.

                • J [email protected]

                  Machine learning algorithm from 2017, scaled up a few orders of magnitude so that it finally more or less works, then repackaged and sold by marketing teams.

[email protected] wrote (#99, edited):

                  Adding weights doesn't make it a fundamentally different algorithm.

                  We have hit a wall where these programs have combed over the totality of the internet and all available datasets and texts in existence.

There isn't any more training data to improve with, and these programs have started polluting the internet with bad data that will make them even dumber and more incorrect in the long run.

                  We're done here until there's a fundamentally new approach that isn't repetitive training.

                  • U [email protected]

                    "His property"

                    Otherwise it's just Ms.

[email protected] wrote (#100):

                    Mrs. originally comes from mistress, which is why it retains the r.

• In reply to [email protected]:

                      I get the meme aspect of this. But just to be clear, it was never fair to judge LLMs for specifically this. The LLM doesn't even see the letters in the words, as every word is broken down into tokens, which are numbers. I suppose with a big enough corpus of data it might eventually extrapolate which words have which letter from texts describing these words, but normally it shouldn't be expected.

[email protected] wrote (#101):

                      I know that words are tokenized in the vanilla transformer. But do GPT and similar LLMs still do that as well? I assumed they also tokenize on character/symbol level, possibly mixed up with additional abstraction down the chain.
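As a concrete illustration of the quoted point that the model sees integer token IDs rather than letters, here is a minimal sketch. It assumes the third-party tiktoken package and uses the cl100k_base encoding purely as an example; GPT-style tokenizers are broadly subword byte-pair encodings rather than character-level, but the exact vocabulary, splits, and IDs differ between models.

```python
# Minimal sketch: what a GPT-style BPE tokenizer hands the model.
# Assumes the third-party `tiktoken` package; cl100k_base is one example
# vocabulary, and the splits/IDs will differ for other models.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

ids = enc.encode("strawberry")
print(ids)                              # a short list of integers
print([enc.decode([i]) for i in ids])   # the subword chunks those integers stand for
```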

                      • I [email protected]

                        Mrs. originally comes from mistress, which is why it retains the r.

[email protected] wrote (#102):

                        But no "r" sound.

                        • I [email protected]

                          Mrs. originally comes from mistress, which is why it retains the r.

[email protected] wrote (#103):

Yes, but from the same source we also get "wife".

                          • U [email protected]

These sorts of artifacts wouldn't be a huge issue except that AI is being pushed to the general public as an alternative means of learning basic information. The meme example is obvious to someone with a strong understanding of English, but learners and children might get an artifact and stamp it in their memory, working for years off bad information. A few false things every now and then aren't a problem; that's unavoidable in learning. Accumulate thousands over long-term use, however, and your understanding of the world gets coarser, like Swiss cheese with voids so large it can't hold itself up.

[email protected] wrote (#104):

You're talking about hallucinations. That's different from tokenization reflection errors. I'm specifically talking about its inability to know how many of a certain type of letter are in a word that it can spell correctly. This is not a hallucination per se -- at least, the mechanism behind it is completely different from whatever causes other factual errors. This specific problem is due to tokenization, and that's why I say it has little bearing on other shortcomings of LLMs.
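For contrast, the counting task itself is trivial for ordinary code that operates on characters, which is the point being made: the failure is about the token-level representation, not the difficulty of the question. A minimal Python sketch:

```python
# Counting letters is a one-liner once you actually operate on characters.
word = "strawberry"
print(word.count("r"))                                     # 3
print({ch: word.count(ch) for ch in sorted(set(word))})    # full per-letter tally
```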

• In reply to [email protected]:

                              "Let me know if you'd like help counting letters in any other fun words!"

Oh well, these newish calls for engagement sure reach ridiculous extremes sometimes.

[email protected] wrote (#105):

I want an option to select Marvin the Paranoid Android mood: "there's your answer, now if you could leave me to wallow in self-pity".

                              • _ [email protected]

                                And yet they can seemingly spell and count (small numbers) just fine.

[email protected] wrote (#106):

What do you mean by spell fine? They're just emitting the tokens for the words. Like, it's not writing "strawberry," it's writing tokens <302, 1618, 19772>, which correspond to st, raw, and berry respectively. If you ask it to put a space between each letter, that will disrupt the tokenization mechanism, and it's going to be quite liable to make mistakes.

                                I don't think it's really fair to say that the lookup 19772 -> berry counts as the LLM being able to spell, since the LLM isn't operating at that layer. It doesn't really emit letters directly. I would argue its inability to reliably spell words when you force it to go letter-by-letter or answer queries about how words are spelled is indicative of its poor ability to spell.
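To illustrate the "space between each letter" point: spelling the word out produces a completely different token sequence, so the model is no longer working with the pieces it normally sees. A sketch, again assuming the tiktoken package and the cl100k_base encoding as an example (splits and IDs vary by model, so the specific IDs cited above may not match):

```python
# Compare the token view of a word with the token view of its spelled-out form.
# Assumes `tiktoken`; cl100k_base is one example encoding, not the only one.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ("strawberry", "s t r a w b e r r y"):
    ids = enc.encode(text)
    print(text, "->", ids, [enc.decode([i]) for i in ids])
```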

                                • U [email protected]

Yes, but from the same source we also get "wife".

[email protected] wrote (#107):

                                  That came later though, as in “I had dinner with the Mrs last night.”

• In reply to [email protected]:

                                    But no "r" sound.

[email protected] wrote (#108):

                                    Correct. I didn’t say there was an r sound, but that it was going off of the spelling. I agree there’s no r sound.

• In reply to [email protected]:

                                      Adding weights doesn't make it a fundamentally different algorithm.

                                      We have hit a wall where these programs have combed over the totality of the internet and all available datasets and texts in existence.

There isn't any more training data to improve with, and these programs have started polluting the internet with bad data that will make them even dumber and more incorrect in the long run.

                                      We're done here until there's a fundamentally new approach that isn't repetitive training.

[email protected] wrote (#109, edited):

Transformers were pretty novel in 2017; I don't know if they were really around before that.

Anyway, I'm doubtful that a larger corpus is what's needed at this point. (Though, that said, there's a lot more text remaining in instant messenger chat logs, like Discord's, that probably has yet to be integrated into LLMs. Not sure.) I'm also doubtful that scaling up is going to keep working, but it wouldn't surprise me that much if it does keep working for a long while. My guess is that there are some small tweaks to be discovered that really improve things a lot but still basically look like repetitive training, as you put it. Who can really say, though.

• In reply to [email protected]:

                                        Imagine asking a librarian "What was happening in Los Angeles in the Summer of 1989?" and that person fetching you ... That's modern LLMs in a nutshell.

                                        I agree, but I think you're still being too generous to LLMs. A librarian who fetched all those things would at least understand the question. An LLM is just trying to generate words that might logically follow the words you used.

IMO, one of the key ideas with the Chinese Room is that there's an assumption that the computer / book in the Chinese Room experiment has infinite capacity in some way. So, no matter what symbols are passed to it, it can come up with an appropriate response. But, obviously, while LLMs are incredibly huge, they can never be infinite. As a result, they can often be "fooled" when they're given input that is semantically similar to a meme, joke or logic puzzle. The vast majority of the training data that matches the input is the meme, or joke, or logic puzzle. LLMs can't reason, so they can't distinguish between "this is just a rephrasing of that meme" and "this is similar to that meme but distinct in an important way".

[email protected] wrote (#110):

                                        Can you explain the difference between understanding the question and generating the words that might logically follow? I'm aware that it's essentially a more powerful version of how auto-correct works, but why should we assume that shows some lack of understanding at a deep level somehow?
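For a toy sense of what "generating the words that might logically follow" means at its crudest, here is a bigram table that predicts the next word from the previous one. Real LLMs learn transformer weights over subword tokens rather than word-count tables, so this only illustrates the autocomplete framing, not how they actually work inside.

```python
# Toy "predict the next word from what came before": a bigram frequency table
# built from a tiny corpus. Purely illustrative; not how an LLM works internally.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def most_likely_next(word: str) -> str:
    # Return the word that most often followed `word` in the corpus.
    return bigrams[word].most_common(1)[0][0]

print(most_likely_next("the"))   # "cat" or "mat", whichever the table ranks first
print(most_likely_next("cat"))   # "sat" or "slept"
```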

                                        • J [email protected]

You're talking about hallucinations. That's different from tokenization reflection errors. I'm specifically talking about its inability to know how many of a certain type of letter are in a word that it can spell correctly. This is not a hallucination per se -- at least, the mechanism behind it is completely different from whatever causes other factual errors. This specific problem is due to tokenization, and that's why I say it has little bearing on other shortcomings of LLMs.

[email protected] wrote (#111):

No, I'm talking about human learning and the danger posed by treating an imperfect tool as a reliable source of information, as these companies want people to do.

Whether the erratic information comes from tokenization or from hallucinations is irrelevant when this is already the main source so many people rely on for learning, for example, a new language.
