Cloudflare announces AI Labyrinth, which uses AI-generated content to confuse and waste the resources of AI Crawlers and bots that ignore “no crawl” directives.

[email protected]

Damned ~~Arasaka~~Cloudflare ice walls are such a pain

[email protected]

the only problem with that solution being applied to generic websites is schools and institutions can have many legitimate users from one IP address and many sites don't want a chance to accidentally block one.

[email protected]

It’s difficult to imagine a group of people voluntarily amassing and then using the resources necessary for “AI” absent the desire to cash in on their investment.

No imagination necessary.

I mean Dmitry Pospelov was arguing for AI control in the Soviet Union clear back in the 70s.

[email protected]

everyone remembers tomogatchi, they were like a digital houseplant.

[email protected]

It's worked alright for me. Your mileage may vary.

If someone is scraping my site at a low crawl rate I honestly don't care so long as it doesn't impact my performance for everyone else. If I hosted anything that was not just public knowledge or copy regurgitated verbatim from the bumf provided by the vendors of the brands I sell, I might oppose to it ideologically. But I don't. So I don't.

If parallel crawling from multiple organizations legitimately becomes a concern for us I will have to get more creative. But thus far it hasn't, and honestly just wholesale blocking Amazon from our shit instantly solved 90% of the problem.

[email protected]

This is fair in those applications. I only run an ecommerce web site, though, so that doesn't come into play.

? Offline

given what domains we're hosted on; i think we've both had a version of this conversation about a thousand times, and both ended up where we ended up. do you want us to explain hypothetically-at-but-mostly-past each other again? I can do it while un-sober, if you like.

? Offline

cool, but where do you get them? you can't, right? because they were stupid?

[email protected]

You probably just should let an AI generate that.

[email protected]

Will it actually allow ordinary users to browse normally, though? Their other stuff breaks in minority browsers. Have they tested this well enough so that it won't? (I'd bet not.)

[email protected]

Cloudflare kind of real for this. I love it.

It makes perfect sense for them as a business, infinite automated traffic equals infinite costs and lower server stability, but at the same time how often do giant tech companies do things that make sense these days?

[email protected]

All the while each AI costs more power than a million human beings to run, and the world burns down around us.

[email protected]

They aren't poisoning the data with disinformation.

They're poisoning it with accurate, but irrelevant information.

For example, if a bot is crawling sites relating to computer programming, or weather, this tool might lure the crawler into pages related to animal facts, or human biology.

[email protected]

LLMs tend to be really bad at detecting AI generated content. I can’t imagine specialized models are much better. For the crawler, it’s also exponentially more expensive.

[email protected]

The energy cost of inference is overstated. Small models, or “sparse” models like Deepseek are not that expensive to run. Training is a one-time cost that still pales in comparison to industrial processes.

Basically, only Altman wants it to be cost prohibitive so he can have a monopoly. Also, he’s full of shit.

[email protected]

So the web is a corporate war zone now and you can choose feudal protection or being attacked from all sides. What a time to be alive.

[email protected]

the used market

[email protected]

IP verification is a not uncommon method for commercial crawlers

[email protected]

Not who you responded to but yeah I want to hear a drug fuelled rant I don't even care what topic

[email protected]

I'd assume they're using aria tags to hide the links from screen readers, at least that's what the article seems to imply.

agnos.is Forums

Cloudflare announces AI Labyrinth, which uses AI-generated content to confuse and waste the resources of AI Crawlers and bots that ignore “no crawl” directives.