AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt
-
[email protected]replied to [email protected] last edited by
It's already permanent and nonstop. They're known to ignore robots.txt and to drop their user agent once they're detected.
And the goal is not only to prevent resource abuse, but to break a predatory model.
But feel free to continue gracefully doing nothing while others take action; it's bound to help eventually.
-
[email protected]replied to [email protected] last edited by
room-temperature
You need to add “IQ”
-
[email protected]replied to [email protected] last edited by
The way I understand it, the hard limit to leave the domain is actually the only one of these rules that would trigger on Nepenthes. The tar pit keeps generating new linked pages full of trash.
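To make the "keeps generating new linked pages" part concrete, here is a minimal sketch of how an endless link maze can work in principle. It is not Nepenthes' actual code; the `tarpit_page` function, the word list, and the URL scheme are all made up for illustration. The idea is that every page is derived only from its own path, so the server stores nothing and every link leads to yet another generated page.

```python
import hashlib
import random

# Hypothetical filler vocabulary; a real tarpit would use something far larger.
WORDS = ["amber", "basalt", "cedar", "delta", "ember", "fjord", "garnet", "harbor"]


def tarpit_page(path: str, n_links: int = 5, n_words: int = 30) -> dict:
    """Build filler text and child links for `path`, derived only from the path."""
    # Seed a private RNG from a hash of the path so the same URL always
    # yields the same page without keeping any state on the server.
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    text = " ".join(rng.choice(WORDS) for _ in range(n_words))
    base = path.rstrip("/")
    # Every link points one level deeper into the same maze.
    links = [f"{base}/{rng.choice(WORDS)}-{rng.randrange(10**6)}" for _ in range(n_links)]
    return {"text": text, "links": links}
```

Since none of the links ever point off the site, a crawler rule like "stop after N hops without leaving the domain" is the kind of limit that would actually fire here, while depth or duplicate-URL checks alone would not.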
-
[email protected]replied to [email protected] last edited by
Your definition of organic traffic is off-standard.
Fair.
The VAST majority of the web would have almost no traffic without web searches. It’s not like people flock to sites from talking about it around the water cooler.
Which is a shame, tbh. We had far better content, when people had to work to create good content, that others wanted, and got passed around.
ie, in school, before search engines, we all knew about Whitehouse.com... We all knew the sites that had the info we wanted/needed at the time.
In fact, I'd argue the downfall of the web as an actual useful tool came about once search engines automatically started indexing, rather than submitting site maps to a page like OpenDirectory to have your site cataloged, indexed, and sorted into appropriate categories by a human.
Because once people started working on "gaming algos" rather than "Making super good content", the internet just became the new "Malls" where you weren't expected to learn, you were just expected to buy.
-
[email protected]replied to [email protected] last edited by
Clearly more than one or two admins are interested in these options. I don’t know why you are assuming that’s the whole list of interested people. Not everyone is as eager as you to roll over and take it without protest.
-
[email protected]replied to [email protected] last edited by
Hey, you keep fighting the good fight, you’ve got them on the ropes! You and all your many, many friends!
-
[email protected]replied to [email protected] last edited by
Hey, you don’t need to convince me, you’ve clearly already committed to bravely sacrificing your own time and money in this valiant fight. Go get ‘em, tiger! I look forward to the articles about AI being stopped coming out any day now.
-
[email protected]replied to [email protected] last edited by
manual and builds are here: https://zadzmo.org/code/nepenthes/
-
[email protected]replied to [email protected] last edited by
what is your deal?
-
[email protected]replied to [email protected] last edited by
I liked it back when link aggregators were the go-to for discovery. You could have sites that were real gems that were just tucked away.
I think the indexing started out ok. Counting backlinks and using that as a ranking was pretty genius, right up until people realized they could game the system, then Google realized that artificially screwing with their own system was worth money, then they used ads to modify ranking.
Ads to modify discoverability: the death of the free internet.
-
[email protected]replied to [email protected] last edited by
The only AI company that responded to Ars' request to comment was OpenAI, whose spokesperson confirmed that OpenAI is already working on a way to fight tarpitting.
Ah yes. It's extremely common for one of the top companies in an industry to spitefully expend resources fighting the efforts of...
One or two people
Please, continue to grace us with your unbiased wisdom. Clearly you've read the article and aren't just trying to simp for AI or start flame wars like a petulant child.
-
[email protected]replied to [email protected] last edited by
Not like you can load balance requests for the malicious subdirectories to non-prod hardware. It can even be decommissioned hardware.
-
[email protected]replied to [email protected] last edited by
Well, luckily for them, it's a pretty simple fix. Congrats on being a part of making them jot down a note to prevent tarpitting when they get around to it. You've saved the internet!
And stop pretending like you're unbiased either. We both have our preconceived notions, and you're no more likely to be open to changing yours than I am. In fact, given the hysterical hyperventilating anti-AI "activists" get up to, we both know you're not ever going to change your mind on AI, and as such you'll glom onto any small action you think is gonna stick it to the man, no matter whether that action is going to have any practical effect on the push for AI or not.
-
[email protected]replied to [email protected] last edited by
I'm just vibing, watching the hysterics you guys get up to.
-
[email protected]replied to [email protected] last edited by
No, you’re being a petulant, naysaying child. Leave us alone and go play with your Duplos. Adults are talking.
-
[email protected]replied to [email protected] last edited by
How many hobby website admins have load balancing for their small sites? How many have decommissioned hardware? Because if you find me a corporation willing to accept the liability that doing something like this could open them up to, I'll pay you a million dollars.
-
[email protected]replied to [email protected] last edited by
Bigotry? From a lemmy user? Never seen it before!
If you don't like what I'm saying, block me and move along. Or report my comments, if you think they're offensive enough. If I'm breaking a rule or the mods don't like what I have to say, maybe they'll remove them, or even ban me from the comm! That's the limit of your options for getting rid of me though.
-
[email protected]replied to [email protected] last edited by
Interesting. Mega supporters are now cold-blooded.
-
[email protected]replied to [email protected] last edited by
Bigotry lmao talk about a Hail Mary.
-
[email protected]replied to [email protected] last edited by
I get that the Internet doesn't contain an infinite number of domains. Max visits to each one can be limited. Hel-lo, McFly?
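For anyone wondering what a per-domain visit cap looks like on the crawler side, here is a rough sketch. It isn't how any particular AI scraper works; `crawl`, `fetch_links`, and `max_per_domain` are hypothetical names, and `fetch_links` stands in for whatever actually downloads a page and extracts its URLs.

```python
from collections import Counter, deque
from urllib.parse import urlparse


def crawl(seeds, fetch_links, max_per_domain=100):
    """Breadth-first crawl that stops visiting a domain once its budget is spent."""
    visits = Counter()   # pages fetched so far, keyed by domain
    seen = set(seeds)    # URLs already queued, to avoid duplicates
    queue = deque(seeds)
    visited = []
    while queue:
        url = queue.popleft()
        domain = urlparse(url).netloc
        if visits[domain] >= max_per_domain:
            continue     # budget exhausted: skip the rest of this domain
        visits[domain] += 1
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

With a cap like this, an endless maze costs the crawler at most `max_per_domain` fetches per domain, so the tarpit wastes a bounded amount of its time. The flip side is that the same cap also cuts off large legitimate sites, and it does nothing about mazes spread over many hostnames.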