Anubis - Weighs the soul of incoming HTTP requests using proof-of-work to stop AI crawlers
-
It isn't, on many levels.
-
It only runs against the Firefox user agent. This is not great, as the user agent can easily be changed. It may work now, but tomorrow that could all change.
-
It doesn't measure load, so even if your website has only a few people accessing it, they will still have to do the proof of work.
-
The PoW algorithm is not well designed and requires a lot of compute on the server, which means that it could be used as a denial-of-service attack vector. It also uses SHA-256, which isn't optimized for a proof-of-work calculation and can be brute-forced pretty easily with hardware.
In summary the Tor implementation is a lot better. I would love to see someone port it to the clearnet.
I look forward to Tor's PoW coming out for FOSS WAFs.
-
-
I did not find any instructions on the source page on how to actually deploy this. That would be a nice touch, imho.
There are some detailed instructions on the docs site, tho I agree it'd be nice to have in the readme, too.
Sounds like the dev was not expecting this much interest for the project out of nowhere so there will def be gaps.
-
I just started using this myself, seems pretty great so far!
Clearly doesn't stop all AI crawlers, but a significantly large chunk of them.
I think the maze approach is better; this seems like it hurts legitimate users of the web more than it would hurt a company.
-
...Why? It's just telling companies they can get support + white-labeling for a fee, and asking you to keep their silly little character in a tongue-in-cheek manner.
Just like they say, you can modify the code and remove it for free if you really want; they're not forbidding you from doing so or anything.
True, but I think you are discounting the risk that the actual god Anubis will take displeasure at such an act, potentially dooming one's real life soul.
-
I just started using this myself, seems pretty great so far!
Clearly doesn't stop all AI crawlers, but a significantly large chunk of them.
Nice. Crypto miners disguised as anti-AI.
-
Nice. Crypto miners disguised as anti-AI.
what about this is crypto mining?
-
It's a clever solution but I did see one recently that IMO was more elegant for noscript users. I can't remember the name but it would create a dummy link that human users won't touch, but webcrawlers will naturally navigate into, but then generates an infinitely deep tree of super basic HTML to force bots into endlessly trawling a cheap-to-serve portion of your webserver instead of something heavier.
generates an infinitely deep tree
Wouldn't the bot simply limit the depth of its seek?
-
I just started using this myself, seems pretty great so far!
Clearly doesn't stop all AI crawlers, but a significantly large chunk of them.
Why SHA-256? Literally every processor has a crypto accelerator and will easily pass. And datacenter servers have beefy server CPUs. This is only effective against no-JS scrapers.
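For anyone unsure what a SHA-256 proof of work involves, here is a minimal hashcash-style sketch. It is not Anubis's actual code: the function names, the challenge format, and the "leading zero hex digits" difficulty rule are all assumptions made for illustration.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side (hypothetical): a random string the client must hash against."""
    return secrets.token_hex(16)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: try nonces until the hex digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
        nonce += 1


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the client's answer."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=4)  # about 16**4 attempts on average
    print(nonce, verify(challenge, nonce, difficulty=4))
```

Each extra zero digit multiplies the expected client work by 16, while checking stays a single hash. How much that costs a given client depends entirely on how fast it can compute SHA-256, which is the concern raised above.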
-
I think the maze approach is better; this seems like it hurts legitimate users of the web more than it would hurt a company.
For those not aware, Nepenthes is an example of the above-mentioned approach!
-
I did not find any instructions on the source page on how to actually deploy this. That would be a nice touch, imho.
Or even a quick link to the relevant portion of the docs at least would be cool
-
generates an infinitely deep tree
Wouldn't the bot simply limit the depth of its seek?
It could be infinitely wide too if they desired. I wouldn't think that would be hard to do. I suspect crawlers limit the time they spend on a chain so that they eventually escape, but this still protects data because it buries the legitimate data the crawler wants. The goal isn't to trap them forever. It's to keep them from getting anything useful.
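To make the depth-limit point concrete, here is a rough sketch of a maze whose links are derived from a hash of the current path, plus a crawler that stops at a fixed depth. None of this is taken from Nepenthes or any other real tarpit; `maze_links`, `maze_page`, and `crawl` are hypothetical names, and the link scheme is just an assumption that keeps the server stateless.

```python
import hashlib
from collections import deque


def maze_links(path: str, width: int = 3) -> list[str]:
    """Derive child links from a hash of the path, so the server stores nothing.

    A SHA-256 hex digest has 64 characters, so this sketch supports width <= 8.
    """
    seed = hashlib.sha256(path.encode()).hexdigest()
    base = path.rstrip("/")
    return [f"{base}/{seed[i * 8:(i + 1) * 8]}" for i in range(width)]


def maze_page(path: str, width: int = 3) -> str:
    """Render a very cheap HTML page that only links deeper into the maze."""
    links = "".join(f'<a href="{u}">{u}</a>' for u in maze_links(path, width))
    return f"<html><body>{links}</body></html>"


def crawl(start: str, max_depth: int, width: int = 3) -> int:
    """Breadth-first crawl that stops expanding at max_depth; returns pages fetched."""
    fetched = 0
    queue = deque([(start, 0)])
    while queue:
        path, depth = queue.popleft()
        fetched += 1  # stands in for an HTTP request to maze_page(path)
        if depth < max_depth:
            queue.extend((child, depth + 1) for child in maze_links(path, width))
    return fetched


if __name__ == "__main__":
    # 1 + 3 + 9 + 27 pages fetched before the depth limit ends the crawl
    print(crawl("/maze", max_depth=3, width=3))
```

A depth limit does let the crawler escape, but only after it has spent its budget on pages that contain nothing but more maze links, and the cost grows quickly with the width.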
-
Why SHA-256? Literally every processor has a crypto accelerator and will easily pass. And datacenter servers have beefy server CPUs. This is only effective against no-JS scrapers.
It requires a bunch of browser features that non-user browsers don't have, and the proof-of-work part is like the least relevant piece in this that only gets invoked once a week or so to generate a unique cookie.
I sometimes have the feeling that as soon as some crypto-currency related features are mentioned, people shut off part of their brain. Either because they hate crypto-currencies or because crypto-currency scammers have trained them to only look at some technical implementation details and fail to see the larger picture that they are being scammed.
-
It isn't, on many levels.
-
It only runs against the Firefox user agent. This is not great, as the user agent can easily be changed. It may work now, but tomorrow that could all change.
-
It doesn't measure load, so even if your website has only a few people accessing it, they will still have to do the proof of work.
-
The PoW algorithm is not well designed and requires a lot of compute on the server, which means that it could be used as a denial-of-service attack vector. It also uses SHA-256, which isn't optimized for a proof-of-work calculation and can be brute-forced pretty easily with hardware.
In summary the Tor implementation is a lot better. I would love to see someone port it to the clearnet.
I use https://sx.catgirl.cloud/ so I'm already primed to have anime catgirls protecting my webs.
-
-
I use https://sx.catgirl.cloud/ so I'm already primed to have anime catgirls protecting my webs.
Catgirls, jackalgirls, all embarrassing. Go full-on furry.
-
For those not aware, Nepenthes is an example of the above-mentioned approach!
This looks like it can actually fuck up some models, but the unnecessary CPU load it will generate means most websites won't use it, unfortunately.
-
What's the ffxiv reference here?
Anubis is from Egyptian mythology.
-
What's the ffxiv reference here?
Anubis is from Egyptian mythology.
The names of release versions are famous FFXIV Garleans
-
It's a clever solution but I did see one recently that IMO was more elegant for noscript users. I can't remember the name but it would create a dummy link that human users won't touch, but webcrawlers will naturally navigate into, but then generates an infinitely deep tree of super basic HTML to force bots into endlessly trawling a cheap-to-serve portion of your webserver instead of something heavier.
That's a tarpit that you're describing, like iocaine or Nepenthes. Those feed the crawler junk data to try to make its eventual output bad.
Anubis tries to not let the AI crawlers in at all.
-
It requires a bunch of browser features that non-user browsers don't have, and the proof-of-work part is like the least relevant piece in this that only gets invoked once a week or so to generate a unique cookie.
I sometimes have the feeling that as soon as some crypto-currency related features are mentioned, people shut off part of their brain. Either because they hate crypto-currencies or because crypto-currency scammers have trained them to only look at some technical implementation details and fail to see the larger picture that they are being scammed.
So if you try to access a website using this technology via terminal, what happens? The connection fails?