How to combat large amounts of AI scrapers

[email protected]

You probably don't need me to tell you, but keep good backups. Friend of mine recently had his account nuked without any reason given, and without the possibility of recourse.

An email from Oracle, announcing the immediate termination of service and the deletion of all data.

[email protected]

I wonder why RoboNope doesn't just make a fail2ban entry for anything that accesses a disallowed URL and drop them entirely.

Actually, this looks like it would do something similar, then dump them to fail2ban after they re-access the honeypot page too many times: https://petermolnar.net/article/anti-ai-nepenthes-fail2ban/
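
To make the "ban after too many honeypot hits" idea concrete, here is a rough Python sketch. It is not how RoboNope or the linked setup actually works; it only counts requests to disallowed paths per IP in an access log and returns the addresses that went over a threshold. The function name `ips_to_ban`, the `/trap/` prefix and the threshold are made up for illustration, and the log format is assumed to be the usual common/combined format.

```python
import re
from collections import Counter

# Matches the start of a common/combined access log line:
# client IP, two ignored fields, the bracketed timestamp,
# then the quoted request line with method and path.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)')


def ips_to_ban(log_lines, disallowed_prefixes, max_hits=3):
    """Return the set of client IPs that requested a disallowed path
    more than max_hits times (hypothetical helper, not a fail2ban API)."""
    prefixes = tuple(disallowed_prefixes)
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, path = match.groups()
        if path.startswith(prefixes):
            hits[ip] += 1
    return {ip for ip, count in hits.items() if count > max_hits}
```

The returned set could then be handed to whatever does the actual blocking (fail2ban, a firewall rule, a deny list in the reverse proxy).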

[email protected]

That is absolutely correct. I don't want ANY AI on my servers looking for anything, regardless of whether they are crawlers or acting on behalf of some lazy fuck.

[email protected]

This doesn't really apply, because I run some frontends, so there isn't really any information that AI needs.

[email protected]

Anubis has this built in: if it detects bots, it turns the difficulty up to impossible.

[email protected]

BunkerWeb looks interesting.

[email protected]

From what I've heard, that's pretty common at Oracle, but it's good to spread the word.

[email protected]

So, what I'm reading is, if your "users" are bad (or bots), just get better users.

Sounds like a net win.

[email protected]

What's bothering you?

Is it that your data ends up being used for AI training? I guess you can't fundamentally protect against this, except by limiting how much content is served to each address.
Or is it the resource strain that it causes on your server? In that case I recommend limiting how much a single client / IP address can request in a day.
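
A per-day limit per IP can be sketched in a few lines of Python. This is only an in-memory illustration of the idea, assuming a single process and a fixed UTC-day window; the class name `DailyLimiter` and its parameters are hypothetical, and in practice you would more likely use the rate limiting built into your reverse proxy.

```python
import time
from collections import defaultdict


class DailyLimiter:
    """Allow each client IP at most max_requests_per_day requests
    per UTC day (fixed window, counters kept in memory only)."""

    def __init__(self, max_requests_per_day, clock=time.time):
        self.max_requests = max_requests_per_day
        self.clock = clock  # injectable so the window can be tested
        self.day = None
        self.counts = defaultdict(int)

    def allow(self, client_ip):
        # Number of whole days since the epoch identifies the window.
        today = int(self.clock() // 86400)
        if today != self.day:
            # A new day started: forget all counters from the old one.
            self.day = today
            self.counts.clear()
        if self.counts[client_ip] >= self.max_requests:
            return False
        self.counts[client_ip] += 1
        return True
```

A request handler would call `allow(ip)` first and answer with HTTP 429 when it returns `False`. Scrapers that rotate through many addresses will get around a purely per-IP limit, so this helps with load more than with keeping content out of training sets.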

[email protected]

Does Anubis not work?

[email protected]

I can only get it to protect one container. I have three that I need protected, and I can't figure out how to run more than one instance of it.

[email protected]

It's the strain of it. I mostly run instances and frontends, so the training is not a huge problem.

[email protected]

I'll check RoboNope out, seems promising.

[email protected]

I've been using Anubis. My only issue is that I would have to run more than one instance, and I don't like Cloudflare personally.

[email protected]

The keyword you need is "DDoS protection", I guess.

It keeps the server from getting overloaded by too many requests.

agnos.is Forums
