agnos.is Forums


Can we trust LLM calculations?

Ask Lemmy · asklemmy
69 Posts 48 Posters 3 Views
  • F [email protected]

    I'm a little confused after listening to a podcast with.... Damn I can't remember his name. He's English. They call him the godfather of AI. A pioneer.

    Well, he believes that gpt 2-4 were major breakthroughs in artificial infection. He specifically said chat gpt is intelligent. That some type of reasoning is taking place. The end of humanity could come in a year to 50 years away. If the fella who imagined a Neural net that is mapped using the human brain. And this man says it is doing much more. Who should I listen too?. He didn't say hidden AI. HE SAID CHAT GPT. HONESTLY ON OFFENSE. I JUST DON'T UNDERSTAND THIS EPIC SCENARIO ON ONE SIDE AND TOTALLY NOTHING ON THE OTHER

    [email protected] #36

    Anyone with a stake in the development of AI is lying to you about how good the models are and how soon they will be able to do X.

    They have to be lying, because the truth is that LLMs are terrible. They can't reason at all. When they perform well on benchmarks, it's because every benchmark contains questions that are in the LLM's training data. If you burn trillions of dollars and have nothing to show for it, you lie so people keep giving you money.

    https://arxiv.org/html/2502.14318

    However, the extent of this progress is frequently exaggerated based on appeals to rapid increases in performance on various benchmarks. I have argued that these benchmarks are of limited value for measuring LLM progress because of problems of models being over-fit to the benchmarks, lack of real-world relevance of test items, and inadequate validation for whether the benchmarks predict general cognitive performance. Conversely, evidence from adversarial tasks and interpretability research indicates that LLMs consistently fail to learn the underlying structure of the tasks they are trained on, instead relying on complex statistical associations and heuristics which enable good performance on test benchmarks but generalise poorly to many real-world tasks.

    • F [email protected]

      It checked out. But, all six getting the same is likely incorrect?.

    [email protected] #37

    If all 6 got the same answer multiple times, that means your query correlated very strongly with that reply in the training data used by all of them. Does that mean it's therefore correct? Well, no. It could mean that there were a bunch of incorrect examples of your query that they used to come up with that answer. It could mean that the examples it's working from seem to follow a pattern that your problem fits into, but the correct answer doesn't actually fit that seemingly obvious pattern. And yes, there's a decent chance it could actually be correct. The problem is that the only way to eliminate those other, still-likely possibilities is to actually do the problem, at which point asking the LLM accomplished nothing.

      • F [email protected]

        I'm a little confused after listening to a podcast with.... Damn I can't remember his name. He's English. They call him the godfather of AI. A pioneer.

        Well, he believes that gpt 2-4 were major breakthroughs in artificial infection. He specifically said chat gpt is intelligent. That some type of reasoning is taking place. The end of humanity could come in a year to 50 years away. If the fella who imagined a Neural net that is mapped using the human brain. And this man says it is doing much more. Who should I listen too?. He didn't say hidden AI. HE SAID CHAT GPT. HONESTLY ON OFFENSE. I JUST DON'T UNDERSTAND THIS EPIC SCENARIO ON ONE SIDE AND TOTALLY NOTHING ON THE OTHER

    [email protected] #38

    One step might be to try to understand the basic principles behind what makes an LLM function. The YouTube channel 3blue1brown has at least one good video on transformers and how they work, and perhaps that will help you understand that "reasoning" is a very broad term that doesn't necessarily mean thinking. What goes on inside an LLM is fascinating, and it's amazing what does manage to come out that's useful, but like any tool it can't be used for everything well, if at all.

  • [email protected] #39

          I'll ask AI what's really going on lolool.

          • E [email protected]

            If all 6 got the same answer multiple times, then that means that your query very strongly correlated with that reply in the training data used by all of them. Does that mean it's therefore correct? Well, no. It could mean that there were a bunch of incorrect examples of your query they used to come up with that answer. It could mean that the examples it's working from seem to follow a pattern that your problem fits into, but the correct answer doesn't actually fit that seemingly obvious pattern. And yes, there's a decent chance it could actually be correct. The problem is that the only way to eliminate those other still also likely possibilities is to actually do the problem, at which point asking the LLM accomplished nothing.

    [email protected] #40

            I think the best thing at this juncture is to ask an LLM WHAT THE TRUTH IS LOL

            • F [email protected]

              I'll ask AI what's really going on lolool.

    [email protected] #41

    Funny, but also not a bad idea, as you can ask it to clarify things as you go. I just referenced that YT channel because he has a great ability to show things visually to help them make sense.

              • F [email protected]

                Ok, you have a moderately complex math problem you needed to solve. You gave the problem to 6 LLMS all paid versions. All 6 get the same numbers. Would you trust the answer?

    [email protected] #42

    No. LLMs are designed to drive up user engagement, nothing else; they're programmed to present what you want to hear, not actual facts. Plus, they're straight-up not designed to do math.

                • U [email protected]

                  The whole "two r's in strawberry" thing is enough of an argument for me. If things like that happen at such a low level, its completely impossible that it wont make mistakes with problems that are exponentially more complicated than that.

    [email protected] #43

    The problem with that is that it isn't actually counting the R's.

    You'd probably have better luck asking it to write a script for you that returns the number of instances of a letter in a string of text, then getting it to explain how to get it running and how it works. You'd get the answer that way, and also then have a script that could count almost any character in text of almost any size.

    That's much more complicated, impressive, and useful, imo.
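    The kind of script described above is tiny in most languages; here is a minimal sketch in Python (the function name is my own, not something from the thread):

    ```python
    # Count how many times a letter appears in a text, case-insensitively.
    # Unlike asking an LLM to "count the r's", this actually counts.
    def count_letter(text: str, letter: str) -> int:
        return text.lower().count(letter.lower())

    print(count_letter("strawberry", "r"))  # prints 3
    ```

    It works on any character and on text of almost any size, which is the point the post is making.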

                  • M [email protected]

                    LLMs don't and can't do math. They don't calculate anything, that's just not how they work. Instead, they do this:

                    2 + 2 = ? What comes after that? Oh, I remember! It's '4'!

                    It could be right, it could be wrong. If there's enough pattern in the training data, it could remember the correct answer. Otherwise it'll just place a plausible looking value there (behavior known as AI hallucination). So, you can not "trust" it.
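    The "remember, don't calculate" behavior described in that quote can be caricatured as a lookup table. A toy sketch (the memorized pairs and the fallback value are invented for illustration; real models interpolate statistically rather than storing literal pairs, but the failure mode is similar):

    ```python
    # Toy caricature of next-token "recall": answers come from memorized
    # examples, not from doing arithmetic.
    MEMORIZED = {"2 + 2 =": "4", "3 + 3 =": "6"}  # stand-in "training data"

    def toy_predict(prompt: str) -> str:
        # Seen before: return the memorized continuation.
        # Never seen: emit something plausible-looking (a "hallucination").
        return MEMORIZED.get(prompt, "7")

    print(toy_predict("2 + 2 ="))      # "4" -- looks like math, is recall
    print(toy_predict("171 + 262 ="))  # "7" -- confidently wrong
    ```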

    [email protected] #44

                    Every LLM answer is a hallucination.

                    • F [email protected]

                      Ok, you have a moderately complex math problem you needed to solve. You gave the problem to 6 LLMS all paid versions. All 6 get the same numbers. Would you trust the answer?

    [email protected] #45

                      L-L-Mentalist!

                      • F [email protected]

                        Ok, you have a moderately complex math problem you needed to solve. You gave the problem to 6 LLMS all paid versions. All 6 get the same numbers. Would you trust the answer?

    [email protected] #46

                        Maybe? I'd be looking all over for some convergent way to fuck it up, though.

                        If it's just one model or the answers are only close, lol no.

                        • M [email protected]

                          Every LLM answer is a hallucination.

    [email protected] #47

                          Some are just realistic to the point of being correct. It frightens me how many users have no idea about any of that.

                          • S [email protected]

                            short answer: no.

                            Long Answer: They are still (mostly) statisics based and can't do real math. You can use the answers from LLMs as starting point, but you have to rigerously verify the answers they give.

    [email protected] #48

    A calculator as a tool for an LLM, though, does work, at least mostly, and could get better as the kinks are worked out.

  • [email protected] #49

    Finally, an intelligent comment. So many comments in here don't realize that most LLMs are bundled with calculators that just do the math.

                              • F [email protected]

                                Ok, you have a moderately complex math problem you needed to solve. You gave the problem to 6 LLMS all paid versions. All 6 get the same numbers. Would you trust the answer?

    [email protected] #50

    Most LLMs now call functions in the background; most calculations are just simple Python expressions.
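    A sketch of the kind of calculator tool an LLM can call instead of predicting digits, using Python's ast module so only safe arithmetic is evaluated (the function name and the set of supported operators are my choices for illustration, not any vendor's actual tool):

    ```python
    import ast
    import operator

    # Map AST operator nodes to real arithmetic functions.
    OPS = {
        ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.USub: operator.neg,
    }

    def calc(expr: str):
        """Evaluate a simple arithmetic expression without eval()'s risks."""
        def ev(node):
            if isinstance(node, ast.Constant):  # a literal number
                return node.value
            if isinstance(node, ast.BinOp):     # e.g. a + b
                return OPS[type(node.op)](ev(node.left), ev(node.right))
            if isinstance(node, ast.UnaryOp):   # e.g. -a
                return OPS[type(node.op)](ev(node.operand))
            raise ValueError("unsupported expression")
        return ev(ast.parse(expr, mode="eval").body)

    print(calc("12 * (3 + 4)"))  # 84
    ```

    A function-calling layer passes the expression string to a tool like this and splices the returned number back into the model's reply, so the arithmetic itself is never "predicted".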

                                • F [email protected]

                                  Ok, you have a moderately complex math problem you needed to solve. You gave the problem to 6 LLMS all paid versions. All 6 get the same numbers. Would you trust the answer?

    [email protected] #51

    Yes, with absolute certainty.

    For example: 2 + 2 = 5

    It's absolutely correct, and if you dispute it, Big Bro is gonna have to re-educate you on that.

  • [email protected] wrote:

    Yes, with absolute certainty.

    For example: 2 + 2 = 5

    It's absolutely correct, and if you dispute it, Big Bro is gonna have to re-educate you on that.

    [email protected] #52

                                    I NEED TO consult every LLM VIA TELEKINESIS QUANTUM ELECTRIC GRAVITY A AND B WAVE.

                                    • Q [email protected]

                                      Most LLM's now call functions in the background. Most calculations are just simple Python expressions.

    [email protected] #53

    Yes, I was aware of that, but I was manipulated by an analog device.

                                      • F [email protected]

                                        Ok, you have a moderately complex math problem you needed to solve. You gave the problem to 6 LLMS all paid versions. All 6 get the same numbers. Would you trust the answer?

    [email protected] #54

    I tested my local LLM for the first few weeks: every "answer" it gave me, I still looked up elsewhere to confirm. Once I saw it was accurate, I still check every 10 or so questions just to make sure. Unfortunately, I feel like they are making search engines worse on purpose so that the AI, in this case a server-hosted or local LLM, can replace them. This is the sweet spot. I wouldn't advise getting any of the newer LLMs that will come out in the next few months (next generation).

                                        • F [email protected]

                                          Ok, you have a moderately complex math problem you needed to solve. You gave the problem to 6 LLMS all paid versions. All 6 get the same numbers. Would you trust the answer?

    [email protected] #55

    No, because there is randomness involved.
