agnos.is Forums

The Open-Source Software Saving the Internet From AI Bot Scrapers

102 Posts 65 Posters 1 Views
  • H [email protected]

    I don't think this would help:

    https://thispersondoesnotexist.com/

    I This user is from outside of this forum
    [email protected]
    wrote last edited by
    #93

    By photo ID, I don't mean just any photo, I mean "photo id" cryptographically signed by the state, certificates checked, database pinged, identity validated, the whole enchilada

    russjr08@bitforged.spaceR 1 Reply Last reply
    0
    • D [email protected]

      "Yes", for any bits the user sees. The frontend UI can be behind Anubis without issues. The API, including both user and federation, cannot. We expect "bots" to use an API, so you can't put human verification in front of it. These "bots" also include applications that aren't aware of Anubis, or are unable to pass it, like all third-party Lemmy apps.

      That does stop almost all generic AI scraping, though it does not prevent targeted abuse.

      B This user is from outside of this forum
      [email protected]
      wrote last edited by
      #94

      The API, including both user and federation, cannot.

      This is theoretically an issue; in practice, however, Anubis only weighs requests that appear to come from a browser: https://anubis.techaro.lol/docs/design/how-anubis-works

      I just tested my instance with Jerboa and it seems to work just fine.
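The idea of only challenging browser-like requests can be sketched roughly as below. This is a simplified illustration, not Anubis's actual code: it assumes the decision is made purely on whether the User-Agent header contains "Mozilla", and the function name `should_challenge` and the client string `ExampleLemmyApp/1.0` are hypothetical.

```python
def should_challenge(headers: dict[str, str]) -> bool:
    """Return True when a request looks like it came from a web browser.

    Assumption: "looks like a browser" means the User-Agent header
    contains the token "Mozilla", which mainstream browsers all send.
    """
    user_agent = headers.get("User-Agent", "")
    return "Mozilla" in user_agent


# A typical browser request versus a hypothetical API client.
browser = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"}
api_client = {"User-Agent": "ExampleLemmyApp/1.0"}

print(should_challenge(browser))     # browser-like, gets the challenge
print(should_challenge(api_client))  # passes straight through
```

Under that assumption, an app or federation request with its own User-Agent never sees the challenge, which would be consistent with a third-party client continuing to work. It also means a scraper that does not claim to be a browser is not stopped by this check alone.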

      1 Reply Last reply
      1
      • jackbydev@programming.devJ [email protected]

        It's funny that older captchas could be viewed as proof of work algorithms now because image recognition is so good. (From using captchas.)

        mubelotix@jlai.luM This user is from outside of this forum
        [email protected]
        wrote last edited by [email protected]
        #95

        Interesting stance. I have bought many tens of thousands of captcha solves for legitimate reasons, and I have now completely lost faith in them.

        1 Reply Last reply
        1
        • repletelocum@lemmy.blahaj.zoneR [email protected]

          So they make the internet worse for poor people? I could get through 20k in a second, but someone with just an old laptop would take a few minutes, no?

          C This user is from outside of this forum
          [email protected]
          wrote last edited by
          #96

          Isn't that just the way things work in general though? If you have a worse computer, everything is going to be slower, broadly speaking.

          1 Reply Last reply
          4
          • P [email protected]

            I've seen this pop up on websites a lot lately. Usually it takes a few seconds to load the website but there have been occasions where it seemed to hang as it was stuck on that screen for minutes and I ended up closing my browser tab because the website just wouldn't load.

            Is this a (known) issue or is it intended to be like this?

            C This user is from outside of this forum
            [email protected]
            wrote last edited by
            #97

            I have had a similar experience. Most sites with Anubis take only a few seconds to get through, but I ran into one, I think it was some small blog, where it took at least 5 minutes. Like someone mentioned, it may have come down to how they set it up, with the number of hashes required. The site that took forever for me seemed to have some exorbitant number like 5k or 50k (I don't recall exactly).
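To see why the configured difficulty matters so much, here is a minimal sketch of a hash-based proof of work. It assumes a scheme where the client must find a nonce such that the SHA-256 hex digest of the challenge plus the nonce starts with a given number of zeros. The names `solve` and `expected_attempts` and the challenge string are made up for illustration, and a real deployment may count difficulty differently.

```python
import hashlib
from itertools import count


def solve(challenge: str, difficulty: int) -> tuple[int, str]:
    """Brute-force a nonce whose digest starts with `difficulty` hex zeros."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest


def expected_attempts(difficulty: int) -> int:
    """Average number of hashes needed: each hex digit has 16 possible values."""
    return 16 ** difficulty


# Low difficulty so the example finishes quickly on any machine.
nonce, digest = solve("example-challenge", 2)
print(nonce, digest)

for d in range(1, 7):
    print(d, expected_attempts(d))
```

In this sketch each extra leading zero multiplies the average work by 16, so a site configured one or two levels higher can turn a wait of seconds into minutes on the same hardware. The search is also random, so an unlucky visitor can need several times the average number of hashes.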

            1 Reply Last reply
            1
            • R [email protected]

              I don't understand how/why this got so popular out of nowhere... the same solution has already existed for years in the form of haproxy-protection and a couple others... but nobody seems to care about those.

              L This user is from outside of this forum
              [email protected]
              wrote last edited by
              #98

              Probably a similar reason as to why we don't hear about the other potential hundreds of competing products or solutions to the same problem (in general).

              Luck.

              It's just not fair in our world.

              1 Reply Last reply
              0
              • repletelocum@lemmy.blahaj.zoneR [email protected]

                So they make the internet worse for poor people? I could get through 20k in a second, but someone with just an old laptop would take a few minutes, no?

                M This user is from outside of this forum
                [email protected]
                wrote last edited by
                #99

                Just wait till they hit my homepage with a 200 MB React frontend, 9 separate tracking / analytics scripts and generic Shopify scripts on it 😛

                1 Reply Last reply
                1
                • K [email protected]

                  I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.

                  It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.

                  It is bad for the environment, because it wastes energy on useless computations similar to mining crypto. If more websites start using this, that really adds up.

                  But most importantly, it won't work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.

                  spicehoarder@lemmy.zipS This user is from outside of this forum
                  [email protected]
                  wrote last edited by
                  #100

                  I don't like it either, because my preferred way to use the web is either through the terminal or a very stripped-down browser. I HATE tracking and JS.

                  1 Reply Last reply
                  1
                  • I [email protected]

                    By photo ID, I don't mean just any photo, I mean "photo id" cryptographically signed by the state, certificates checked, database pinged, identity validated, the whole enchilada

                    russjr08@bitforged.spaceR This user is from outside of this forum
                    [email protected]
                    wrote last edited by
                    #101

                    That would have the same effect as just taking the site offline...

                    No one is giving a random site their photo ID.

                    I 1 Reply Last reply
                    0
                    • russjr08@bitforged.spaceR [email protected]

                      That would have the same effect as just taking the site offline...

                      No one is giving a random site their photo ID.

                      I This user is from outside of this forum
                      [email protected]
                      wrote last edited by
                      #102

                      You'd be surprised. Many people simply have no backbone, common sense, or self-respect, so I think they very probably still would, in large numbers. Proof: Facebook and Palantir.

                      1 Reply Last reply
                      0