agnos.is Forums

Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

Technology · 210 posts · 93 posters
  • A [email protected]

    LOOK MAA I AM ON FRONT PAGE

    S This user is from outside of this forum
    S This user is from outside of this forum
    [email protected]
    wrote on last edited by
    #70

    What's hilarious/sad is the response to this article over on reddit's "singularity" sub, in which all the top comments are people who've obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don't understand AI or "reasoning". It's a weird cult.

    T 1 Reply Last reply
    18
    • A [email protected]

      LOOK MAA I AM ON FRONT PAGE

      F This user is from outside of this forum
      F This user is from outside of this forum
      [email protected]
      wrote on last edited by
      #71

      NOOOOOOOOO

      SHIIIIIIIIIITT

      SHEEERRRLOOOOOOCK

      8 J T 3 Replies Last reply
      22
      • G [email protected]

        Most humans don't reason. They just parrot shit too. The design is very human.

        S This user is from outside of this forum
        S This user is from outside of this forum
        [email protected]
        wrote on last edited by
        #72

        I hate this analogy. As a throwaway whimsical quip it'd be fine, but it's specious enough that I keep seeing it used earnestly by people who think that LLMs are in any way sentient or conscious, so it's lowered my tolerance for it as a topic even if you did intend it flippantly.

        G 1 Reply Last reply
        8
        • F [email protected]

          NOOOOOOOOO

          SHIIIIIIIIIITT

          SHEEERRRLOOOOOOCK

          8 This user is from outside of this forum
          8 This user is from outside of this forum
          [email protected]
          wrote on last edited by
          #73

          Extept for Siri, right? Lol

          T 1 Reply Last reply
          0
          • 8 [email protected]

            Extept for Siri, right? Lol

            T This user is from outside of this forum
            T This user is from outside of this forum
            [email protected]
            wrote on last edited by
            #74

            Apple Intelligence

            1 Reply Last reply
            0
#75 · [email protected], in reply to [email protected]:

> It's an expensive carbon-spewing parrot.

It's a very resource-intensive autocomplete.
              • I [email protected]

                Fair, but the same is true of me. I don't actually "reason"; I just have a set of algorithms memorized by which I propose a pattern that seems like it might match the situation, then a different pattern by which I break the situation down into smaller components and then apply patterns to those components. I keep the process up for a while. If I find a "nasty logic error" pattern match at some point in the process, I "know" I've found a "flaw in the argument" or "bug in the design".

                But there's no from-first-principles method by which I developed all these patterns; it's just things that have survived the test of time when other patterns have failed me.

                I don't think people are underestimating the power of LLMs to think; I just think people are overestimating the power of humans to do anything other than language prediction and sensory pattern prediction.

                C This user is from outside of this forum
                C This user is from outside of this forum
                [email protected]
                wrote on last edited by
                #76

                This whole era of AI has certainly pushed the brink to existential crisis territory. I think some are even frightened to entertain the prospect that we may not be all that much better than meat machines who on a basic level do pattern matching drawing from the sum total of individual life experience (aka the dataset).

                Higher reasoning is taught to humans. We have the capability. That's why we spend the first quarter of our lives in education. Sometimes not all of us are able.

                I'm sure it would certainly make waves if researchers did studies based on whether dumber humans are any different than AI.

                1 Reply Last reply
                0
                • A [email protected]

                  LOOK MAA I AM ON FRONT PAGE

                  M This user is from outside of this forum
                  M This user is from outside of this forum
                  [email protected]
                  wrote on last edited by [email protected]
                  #77

                  I see a lot of misunderstandings in the comments 🫤

                  This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

                  Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

                  zacryon@feddit.orgZ T T R K 7 Replies Last reply
                  48
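(For readers outside the field, a minimal, hypothetical sketch in Python of the outcome-only reward scheme described above: only the final answer is scored, and the intermediate "reasoning" tokens receive no direct training signal. The "Answer:" extraction convention is an illustrative assumption, not taken from the paper.)

```python
# Hypothetical sketch: outcome-based reward for a "reasoning" model.
# The chain-of-thought is ignored entirely; only the text after the
# final "Answer:" tag is compared against the gold answer, so the
# model is never directly rewarded for reasoning correctly.

def outcome_reward(completion: str, gold_answer: str) -> float:
    """Return 1.0 if the final answer matches the gold answer, else 0.0."""
    answer = completion.rsplit("Answer:", 1)[-1].strip()
    return 1.0 if answer == gold_answer.strip() else 0.0

completion = (
    "Think: 18 is prime because it is only divisible by 1 and itself.\n"  # broken reasoning
    "Answer: no"                                                          # correct answer
)
print(outcome_reward(completion, "no"))  # 1.0, despite the broken reasoning
```

Process-supervision approaches score the intermediate steps as well; that is roughly the kind of correction this comment suggests may be needed.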
                  • A [email protected]

                    LOOK MAA I AM ON FRONT PAGE

                    xatolos@reddthat.comX This user is from outside of this forum
                    xatolos@reddthat.comX This user is from outside of this forum
                    [email protected]
                    wrote on last edited by
                    #78

                    So, what your saying here is that the A in AI actually stands for artificial, and it's not really intelligent and reasoning.

                    Huh.

                    C 1 Reply Last reply
                    5
                    • T [email protected]

                      "Computer" meaning a mechanical/electro-mechanical/electrical machine wasn't used until around after WWII.

                      Babbag's difference/analytical engines weren't confusing because people called them a computer, they didn't.

                      "On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

                      • Charles Babbage

                      If you give any computer, human or machine, random numbers, it will not give you "correct answers".

                      It's possible Babbage lacked the social skills to detect sarcasm. We also have several high profile cases of people just trusting LLMs to file legal briefs and official government 'studies' because the LLM "said it was real".

                      A This user is from outside of this forum
                      A This user is from outside of this forum
                      [email protected]
                      wrote on last edited by
                      #79

                      What they mean is that before Turing, "computer" was literally a person's job description. You hand a professional a stack of calculations with some typos, part of the job is correcting those out. Newfangled machine comes along with the same name as the job, among the first thing people are gonna ask about is where it fall short.

                      Like, if I made a machine called "assistant", it'd be natural for people to point out and ask about all the things a person can do that a machine just never could.

                      T 1 Reply Last reply
                      0
                      • M [email protected]

                        I see a lot of misunderstandings in the comments 🫤

                        This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

                        Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

                        zacryon@feddit.orgZ This user is from outside of this forum
                        zacryon@feddit.orgZ This user is from outside of this forum
                        [email protected]
                        wrote on last edited by
                        #80

                        Some AI researchers found it obvious as well, in terms of they've suspected it and had some indications. But it's good to see more data on this to affirm this assessment.

                        K J 2 Replies Last reply
                        7
                        • M [email protected]

                          I see a lot of misunderstandings in the comments 🫤

                          This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

                          Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

                          T This user is from outside of this forum
                          T This user is from outside of this forum
                          [email protected]
                          wrote on last edited by
                          #81

                          Yeah these comments have the three hallmarks of Lemmy:

                          • AI is just autocomplete mantras.
                          • Apple is always synonymous with bad and dumb.
                          • Rare pockets of really thoughtful comments.

                          Thanks for being at least the latter.

                          1 Reply Last reply
                          15
                          • C [email protected]

                            Intellegence has a very clear definition.

                            It's requires the ability to acquire knowledge, understand knowledge and use knowledge.

                            No one has been able to create an system that can understand knowledge, therefor me none of it is artificial intelligence. Each generation is merely more and more complex knowledge models. Useful in many ways but never intelligent.

                            8 This user is from outside of this forum
                            8 This user is from outside of this forum
                            [email protected]
                            wrote on last edited by
                            #82

                            Wouldn't the algorithm that creates these models in the first place fit the bill? Given that it takes a bunch of text data, and manages to organize this in such a fashion that the resulting model can combine knowledge from pieces of text, I would argue so.

                            What is understanding knowledge anyways? Wouldn't humans not fit the bill either, given that for most of our knowledge we do not know why it is the way it is, or even had rules that were - in hindsight - incorrect?

                            If a model is more capable of solving a problem than an average human being, isn't it, in its own way, some form of intelligent? And, to take things to the utter extreme, wouldn't evolution itself be intelligent, given that it causes intelligent behavior to emerge, for example, viruses adapting to external threats? What about an (iterative) optimization algorithm that finds solutions that no human would be able to find?

                            Intellegence has a very clear definition.

                            I would disagree, it is probably one of the most hard to define things out there, which has changed greatly with time, and is core to the study of philosophy. Every time a being or thing fits a definition of intelligent, the definition often altered to exclude, as has been done many times.

                            1 Reply Last reply
                            2
#83 · [email protected], in reply to [email protected] (#80):

> Some AI researchers found it obvious as well, in the sense that they had suspected it and had some indications. But it's good to see more data affirming this assessment.

Lots of us who did some time in search and relevancy early on knew ML was always largely breathless, overhyped marketing. It was endless buzzwords and misframing from the start, but it raised our salaries. Anything the execs don't understand is profitable and worth doing.
                              • M [email protected]

                                I see a lot of misunderstandings in the comments 🫤

                                This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

                                Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

                                T This user is from outside of this forum
                                T This user is from outside of this forum
                                [email protected]
                                wrote on last edited by
                                #84

                                What statistical method do you base that claim on? The results presented match expectations given that Markov chains are still the basis of inference. What magic juice is added to "reasoning models" that allow them to break free of the inherent boundaries of the statistical methods they are based on?

                                M 1 Reply Last reply
                                3
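(A minimal toy sketch of the Markov-chain view of generation this comment invokes: each next token is sampled from a distribution conditioned on the current context. The tiny transition table here is a stand-in for a model's learned, vastly higher-dimensional distribution over a long context window.)

```python
import random

# Toy order-1 Markov chain over tokens: the next token depends only on
# the current one. An LLM conditions on a much longer context, but the
# sampling loop has the same shape.
transitions = {
    ("the",): {"cat": 0.6, "dog": 0.4},
    ("cat",): {"sat": 0.7, "ran": 0.3},
    ("dog",): {"sat": 0.2, "ran": 0.8},
}

def sample_next(context: tuple) -> str:
    """Draw the next token from the distribution conditioned on context."""
    dist = transitions[context]
    return random.choices(list(dist), weights=list(dist.values()))[0]

tokens = ["the"]
for _ in range(2):
    tokens.append(sample_next((tokens[-1],)))
print(" ".join(tokens))  # e.g. "the cat sat"
```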
                                • M [email protected]

                                  I see a lot of misunderstandings in the comments 🫤

                                  This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

                                  Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

                                  R This user is from outside of this forum
                                  R This user is from outside of this forum
                                  [email protected]
                                  wrote on last edited by [email protected]
                                  #85

                                  What confuses me is that we seemingly keep pushing away what counts as reasoning. Not too long ago, some smart alghoritms or a bunch of instructions for software (if/then) was officially, by definition, software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory and even more advanced alghoritms, it's no longer reasoning? I feel like at this point a more relevant question is "What exactly is reasoning?". Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.

                                  https://en.wikipedia.org/wiki/Reasoning_system

                                  M S T 3 Replies Last reply
                                  9
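(For reference, a minimal sketch of the classic if/then "reasoning system" in the sense of the linked Wikipedia article: a forward-chaining engine that derives new facts by applying rules until nothing new follows. The rules and facts are made-up examples.)

```python
# Minimal forward-chaining rule engine: each rule says "if all these
# conditions are known facts, conclude this new fact". The engine
# applies rules repeatedly until the set of facts stops growing.

rules = [
    ({"has_fur", "gives_milk"}, "mammal"),
    ({"mammal", "eats_meat"}, "carnivore"),
]

def forward_chain(facts: set) -> set:
    """Return the closure of `facts` under the rule set."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(forward_chain({"has_fur", "gives_milk", "eats_meat"}))
# {'has_fur', 'gives_milk', 'eats_meat', 'mammal', 'carnivore'}
```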
#86 · [email protected], in reply to [email protected]:

> I'm not sure how you arrived at lime the mineral being a more likely question than lime the fruit. I'd expect someone asking about kidney stones would also be asking about foods that are commonly consumed.
>
> This kind of just goes to show there are multiple ways something can be interpreted. Maybe a smart human would ask for clarification, but for sure today's AIs will just happily spit out the first answer that comes up. LLMs are extremely "good" at making up answers to leading questions, even if they're completely false.

Making up answers is kind of their entire purpose. LLMs are fundamentally just a text-generation algorithm: they are designed to produce text that looks like it could have been written by a human. Which they are amazing at, especially when you take into account how many paragraphs of instructions you can give them, which they tend to follow rather successfully.

The one thing they can't do is verify whether what they are saying is true, as it's all just slapping words together using probabilities. If they could, they would stop being LLMs and start being AGIs.
                                    • M [email protected]

                                      I see a lot of misunderstandings in the comments 🫤

                                      This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

                                      Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

                                      K This user is from outside of this forum
                                      K This user is from outside of this forum
                                      [email protected]
                                      wrote on last edited by
                                      #87

                                      When given explicit instructions to follow models failed because they had not seen similar instructions before.

                                      This paper shows that there is no reasoning in LLMs at all, just extended pattern matching.

                                      M 1 Reply Last reply
                                      10
#88 · [email protected], in reply to [email protected]:

> I think it's important to note (I'm not an LLM; I know that phrase triggers you to assume I am) that they haven't proven this is an inherent architectural issue, which I think would be the next step to the assertion.
>
> Do we know that they don't reason and are incapable of reasoning, or do we just know that for these problems they jump to memorized solutions? Is it possible to create an arrangement of weights that can genuinely reason, even if the current models don't? That's the big question that needs answering. It's still possible that we just haven't properly incentivized reasoning over memorization during training.
>
> If someone can objectively answer "no" to that, the bubble collapses.

> Do we know that they don't reason and are incapable of reasoning?

"even when we provide the algorithm in the prompt—so that the model only needs to execute the prescribed steps—performance does not improve"
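(For context on the quote: the Apple paper's "algorithm in the prompt" experiments centered on puzzles such as Tower of Hanoi, whose complete solution procedure is short enough to hand to the model verbatim. Below is a sketch of that procedure, assuming the standard recursive formulation; the paper's exact prompt wording may differ. Executing it requires no insight, only faithfully following the steps.)

```python
# Sketch of the exact recursive procedure for Tower of Hanoi, the sort
# of algorithm the paper reports supplying in the prompt.

def hanoi(n: int, source: str, target: str, spare: str, moves: list) -> None:
    """Append the optimal move sequence for n disks to `moves`."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)   # clear the top n-1 disks
    moves.append((source, target))               # move the largest disk
    hanoi(n - 1, spare, target, source, moves)   # restack the n-1 disks

moves = []
hanoi(3, "A", "C", "B", moves)
print(len(moves), moves)  # 7 moves: (A,C) (A,B) (C,B) (A,C) (B,A) (B,C) (A,C)
```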
                                        • A [email protected]

                                          LOOK MAA I AM ON FRONT PAGE

                                          H This user is from outside of this forum
                                          H This user is from outside of this forum
                                          [email protected]
                                          wrote on last edited by
                                          #89

                                          XD so, like a regular school/university student that just wants to get passing grades?

                                          1 Reply Last reply
                                          5