The Open-Source Software Saving the Internet From AI Bot Scrapers
-
This is fantastic and I appreciate that it scales well on the server side.
AI scraping is a scourge, and I would love to know the collective amount of power wasted on countermeasures like this so it could be added to the total wasted by AI.
-
I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.
It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.
It is bad for the environment, because it wastes energy on useless computations similar to mining crypto. If more websites start using this, that really adds up.
But most importantly, it won't work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.
-
I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.
It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.
It is bad for the environment, because it wastes energy on useless computations similar to mining crypto. If more websites start using this, that really adds up.
But most importantly, it won't work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.
That last paragraph is nothing but defeatism
-
That last paragraph is nothing but defeatism
On the contrary, I'm hoping for a solution that is better than this.
Do you disagree with any part of my assessment? How do you think Anubis will work long term?
-
I had seen that prompt, but never looked into it. I found it a little annoying, mostly because I didn't know what it was for, but now I won't mind. I hope more solutions are developed.
-
I'd like to use Anubis but the strange hentai character as a mascot is not too professional
-
I'd like to use Anubis but the strange hentai character as a mascot is not too professional
It's just image files, you can remove them or replace the images with something more corporate. The author does state they'd prefer you didn't change the pictures, but the license doesn't require adhering to their personal request. I know at least 2 sites I've visited previously had Anubis running with a generic checkmark or X that replaced the mascot
-
<Stupidquestion>
What advantage does this software provide over simply banning bots via robots.txt?
</Stupidquestion>
-
I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.
It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.
It is bad for the environment, because it wastes energy on useless computations similar to mining crypto. If more websites start using this, that really adds up.
But most importantly, it won't work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.
she's working on a non-cryptographic challenge so it taxes users' CPUs less, and also thinking about a version that doesn't require JavaScript
Sounds like the developer of Anubis is aware and working on these shortcomings.
Still, IMO these are minor short term issues compared to the scope of the AI problem it's addressing.
-
On the contrary, I'm hoping for a solution that is better than this.
Do you disagree with any part of my assessment? How do you think Anubis will work long term?
Long term, Anubis actually costs them millions or billions more in energy, because their bots have to run a browser and more code. Either way they have to add shit to the bots, which costs all the companies money.
-
<Stupidquestion>
What advantage does this software provide over simply banning bots via robots.txt?
</Stupidquestion>
The scrapers ignore robots.txt. It doesn't really ban them - it just asks them not to access things, but they are programmed by assholes.
-
I'd like to use Anubis but the strange hentai character as a mascot is not too professional
I'm sure you could replace it if you really wanted to
-
<Stupidquestion>
What advantage does this software provide over simply banning bots via robots.txt?
</Stupidquestion>
Robots.txt expects that the client is respecting the rules, for instance, marking that they are a scraper.
AI scrapers don't respect this trust, and thus robots.txt is meaningless.
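To make that concrete, here is a minimal sketch of how a well-behaved crawler consults robots.txt, using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot`, the rules, and the URL are made up for illustration. The point is that the check runs inside the client: a scraper that never calls it is not blocked by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks one AI crawler to stay away entirely.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler asks first and skips the page when told no.
polite_bot_may_fetch = is_allowed("ExampleAIBot", "https://example.org/page")
other_client_may_fetch = is_allowed("SomeBrowser", "https://example.org/page")
```

Nothing on the server enforces the answer. A scraper that skips `is_allowed`, or that sends a different user agent string, gets the page anyway.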
-
<Stupidquestion>
What advantage does this software provide over simply banning bots via robots.txt?
</Stupidquestion>
The problem is that AI doesn't follow robots.txt, so Cloudflare and Anubis each developed a solution.
-
I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.
It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.
It is bad for the environment, because it wastes energy on useless computations similar to mining crypto. If more websites start using this, that really adds up.
But most importantly, it won't work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.
It is basically instantaneous on my 12-year-old Kepler GPU Linux box. It is substantially less impactful on the environment than AI tar pits and other deterrents. The cryptography involved is something almost all browsers from the last 10 years can do natively, but that scrapers have to be individually programmed to do. That makes it several orders of magnitude beyond impractical to repurpose every single corporate bot for it, only for that work to be rendered moot, because it's an open-source project and someone will just update the cryptographic algorithm. These posts contain links to articles; if you read them, you might answer some of your own questions and have more to contribute to the conversation.
-
I'd like to use Anubis but the strange hentai character as a mascot is not too professional
I actually really like the developer's rationale for why they use an anime character as the mascot.
The whole blog post is worth reading, but the TL;DR is this:
Of course, nothing is stopping you from forking the software to replace the art assets. Instead of doing that, I would rather you support the project and purchase a license for the commercial variant of Anubis named BotStopper. Doing this will make sure that the project is sustainable and that I don't burn myself out to a crisp in the process of keeping small internet websites open to the public.
At some level, I use the presence of the Anubis mascot as a "shopping cart test". If you either pay me for the unbranded version or leave the character intact, I'm going to take any bug reports more seriously. It's a positive sign that you are willing to invest in the project's success and help make sure that people developing vital infrastructure are not neglected.
-
<Stupidquestion>
What advantage does this software provide over simply banning bots via robots.txt?
</Stupidquestion>
Well, now that y'all put it that way, I think it was pretty naive of me to think that these companies, whose business model is basically theft, would honour a lousy robots.txt file...