The Open-Source Software Saving the Internet From AI Bot Scrapers
-
Non paywalled link https://archive.is/VcoE1
It basically boils down to making the browser do some CPU-heavy calculations before allowing access. This is no problem for a single user, but for a bot farm this would increase the amount of compute power they need 100x or more.
It inherently blocks a lot of the simpler bots by requiring JavaScript as well.
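To make the idea concrete, here is a minimal sketch of a hash-based proof-of-work check. It is not Anubis's actual code: the names `digest`, `solve` and `verify`, the challenge string, and the "leading zero hex digits" rule are all assumptions chosen for illustration.

```python
import hashlib
from itertools import count


def digest(challenge: str, nonce: int) -> str:
    """Hex SHA-256 of the challenge string followed by the nonce."""
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: try nonces until the hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    for nonce in count():
        if digest(challenge, nonce).startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to confirm the client did the work."""
    return digest(challenge, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    # Hypothetical challenge value; a real server would issue a fresh one per visitor.
    nonce = solve("example-challenge", 3)
    print(nonce, verify("example-challenge", nonce, 3))
```

The asymmetry is the point: the client has to search through many nonces, while the server checks the answer with a single hash.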
-
Adding to this, some sites set the difficulty way higher than others: nerdvpn's invidious and redlib instances take about 5 seconds and some ~20k hashes, while privacyredirect's instances are almost instant, with fewer than 50 hashes each time.
So they make the internet worse for poor people? I could get through 20k in a second, but someone with just an old laptop would take a few minutes, no?
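A rough back-of-the-envelope sketch of how the wait scales with hardware. The 4,000 hashes/s figure is derived from the "20k hashes in about 5 seconds" report above, and 20,000 hashes/s from "20k in a second"; the old-laptop rate is a purely hypothetical number, and the function names are made up for this example.

```python
def expected_attempts(hex_zeros: int) -> int:
    """Average number of hashes needed to find `hex_zeros` leading zero hex digits."""
    return 16 ** hex_zeros


def seconds_needed(hashes: int, hashes_per_second: float) -> float:
    """Time to compute `hashes` hashes at a given rate."""
    return hashes / hashes_per_second


if __name__ == "__main__":
    hashes = 20_000  # figure reported in the thread
    rates = {
        "fast desktop (20k in a second)": 20_000,
        "reported instance (20k in ~5 s)": 4_000,
        "old laptop (hypothetical rate)": 400,
    }
    for label, rate in rates.items():
        print(f"{label}: {seconds_needed(hashes, rate):.1f} s")
    # Each extra required zero digit multiplies the average work by 16.
    print(expected_attempts(3), expected_attempts(4))
```

Under these assumed rates the same challenge costs 1 second, 5 seconds, or 50 seconds. The search is also random, so any single run can take several times the average.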
-
To be honest, I need to ask my admin about that!
We don't use anubis but we use iocaine (?), see /0 for the announcement post
-
So they make the internet worse for poor people? I could get through 20k in a second, but someone with just an old laptop would take a few minutes, no?
Well, it's the scrapers that are causing the problem.
-
It’s not always about being first but about marketing.
And one has a cute catgirl mascot, the other a website that looks like a blockchain techbro startup.
I'm even willing to bet the number of people that set up Anubis just to get the cute splash screen isn't insignificant.
Compare and contrast.
High-performance traffic management and next-gen security with multi-cloud management and observability. Built for the enterprise — open source at heart.
Sounds like some overpriced, vacuous, do-everything solution. Looks and sounds like every other tech website. Looks like it is meant to appeal to the people who still say "cyber". Looks and sounds like fauxpen source.
Weigh the soul of incoming HTTP requests to protect your website!
Cute. Adorable. Baby girl. Protect my website. Looks fun. Has one clear goal.
-
Exactly. It's called proof-of-work. It was originally invented to reduce spam emails but was later used by Bitcoin to control its growth speed.
It's funny that older captchas could be viewed as proof of work algorithms now because image recognition is so good. (From using captchas.)
-
I know people love anime, myself included, but this popping up on my work PC can be frustrating
-
I know people love anime, myself included, but this popping up on my work PC can be frustrating
Contact the administrator to ask them to change the landing page
-
Fantastic article! Makes me less afraid to host a website with this potential solution
-
All this could be avoided by making people submit photo ID to log in to an account.
I don't think this would help:
-
So they make the internet worse for poor people? I could get through 20k in a second, but someone with just an old laptop would take a few minutes, no?
i mean, kinda?
you are absolutely right that someone with an old pc might need to wait a few extra seconds, but the speed is ultimately throttled by the browser
-
"You criticize society yet you participate in it. Curious."
If you think that's comparable then you're dumber than I thought.
-
I don't think this would help:
By photo ID, I don't mean just any photo, I mean "photo id" cryptographically signed by the state, certificates checked, database pinged, identity validated, the whole enchilada
-
"Yes", for any bits the user sees. The frontend UI can be behind Anubis without issues. The API, including both user and federation, cannot. We expect "bots" to use an API, so you can't put human verification in front of it. These "bots" also include applications that aren't aware of Anubis, or are unable to pass it, like all third party Lemmy apps.
That does stop almost all generic AI scraping, though it does not prevent targeted abuse.
The API, including both user and federation, cannot.
This is theoretically an issue; however, in practice Anubis only weighs requests that appear to come from a browser: https://anubis.techaro.lol/docs/design/how-anubis-works
I just tested my instance with Jerboa and it seems to work just fine.
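As a rough illustration of "only weighs requests that appear to come from a browser", a filter along these lines would let API clients through. This is a sketch under assumptions, not Anubis's real rule set: the `should_challenge` helper, the "Mozilla" pattern, and the sample User-Agent strings are all made up for the example.

```python
import re

# Assumption: anything whose User-Agent mentions "Mozilla" is treated as a browser.
BROWSER_PATTERN = re.compile(r"Mozilla")


def should_challenge(headers: dict[str, str]) -> bool:
    """Return True if the request looks like a browser and should get the challenge."""
    user_agent = headers.get("User-Agent", "")
    return bool(BROWSER_PATTERN.search(user_agent))


if __name__ == "__main__":
    # Hypothetical User-Agent strings, for illustration only.
    browser = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"}
    app = {"User-Agent": "Jerboa"}
    print(should_challenge(browser), should_challenge(app))
```

A rule like this also shows the limit mentioned above: a scraper that sends a non-browser User-Agent skips the challenge, so it stops generic scraping more than targeted abuse.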
-
It's funny that older captchas could be viewed as proof of work algorithms now because image recognition is so good. (From using captchas.)
Interesting stance. I have bought many tens of thousands of captcha solves for legitimate reasons, and I have now completely lost faith in them
-
So they make the internet worse for poor people? I could get through 20k in a second, but someone with just an old laptop would take a few minutes, no?
Isn't that just the way things work in general though? If you have a worse computer, everything is going to be slower, broadly speaking.
-
I've seen this pop up on websites a lot lately. Usually it takes a few seconds to load the website but there have been occasions where it seemed to hang as it was stuck on that screen for minutes and I ended up closing my browser tab because the website just wouldn't load.
Is this a (known) issue or is it intended to be like this?
I have had a similar experience. Most sites with Anubis take only a few seconds to go through, but I ran into what I think was some small blog where it took at least 5 minutes. Like someone mentioned, it may have been how they set it up with the number of hashes required. The site that took forever for me seemed to have some exorbitant number like 5k or 50k (I don't recall exactly).
-
I don't understand how/why this got so popular out of nowhere... the same solution has already existed for years in the form of haproxy-protection and a couple others... but nobody seems to care about those.
Probably a similar reason as to why we don't hear about the other potential hundreds of competing products or solutions to the same problem (in general).
Luck.
It's just not fair in our world.
-
So they make the internet worse for poor people? I could get through 20k in a second, but someone with just an old laptop would take a few minutes, no?
Just wait till they hit my homepage with a 200 MB React frontend, 9 separate tracking / analytics scripts and generic Shopify scripts on it
-
I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.
It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.
It is bad for the environment, because it wastes energy on useless computations similar to mining crypto. If more websites start using this, that really adds up.
But most importantly, it won't work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.
I don't like it either because my preferred way to use the web is either through the terminal or a very stripped down browser. I HATE tracking and JS