agnos.is Forums

Is there a way to reduce the number of AI generated websites that appear in search results?

asklemmy
41 Posts 31 Posters 2 Views
    [email protected]
    wrote last edited by
    #1

    I keep trying to find things like “making waffles from sourdough discard” and all the sites are the same: long meandering paragraphs full of links to other things on the site, with dubious instructions.

    Considering that at this point I can pretty much identify the type of site by looking at it, are there good extensions or search engines that might remove them from search results?

      [email protected]
      wrote last edited by
      #2

      Use a search engine that doesn't do bullshit LLM stuff. Kagi, for example

        [email protected]
        wrote last edited by
        #3

        You can add filters to uBlock Origin to remove AI overviews if you are using Google, or add “-ai” at the end of your search query. That, or switch to a browser which doesn’t have AI overviews.

        I think what you are describing is just typical filler that you find on recipe websites and not AI. They’re all full of unnecessary stories and links, but most will have a “jump to recipe” button somewhere so you can skip the BS.

          In reply to #3:
          [email protected]
          wrote last edited by
          #4

          The vast majority of that blogspam has been algorithmically generated since long before LLMs.

            [email protected]
            wrote last edited by
            #5

            Update: for now it seems DuckDuckGo’s date range filter is kinda a magic bullet for this type of thing. Set the range between 2010 and 2020 and the top results are much better for a lot of temporally agnostic searches.

              In reply to #4:
              [email protected]
              wrote last edited by
              #6

              Mechanical Turk even before that, which frankly wasn't much better.

                In reply to #3:
                [email protected]
                wrote last edited by
                #7

                I’ll have to try it out ^^

                  [email protected]
                  wrote last edited by
                  #8

                  I use Firefox with the udm14 extension for Google - gives me only the web results. No AI, no shopping, images, etc, only website results.
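For reference, the udm14 trick seems to come down to adding a `udm=14` parameter to a Google search URL, which selects the plain "Web" results tab. A minimal Python sketch, assuming Google keeps honoring that parameter (the helper name `web_only_url` is made up for illustration):

```python
from urllib.parse import urlencode


def web_only_url(query: str) -> str:
    """Build a Google search URL that asks for the plain "Web" tab.

    Assumption: Google keeps honoring udm=14. It is not a stable,
    documented API, so it may change without notice.
    """
    params = {"q": query, "udm": "14"}
    return "https://www.google.com/search?" + urlencode(params)


print(web_only_url("sourdough discard waffles"))
```

A URL of this shape can also be set as a custom search engine in the browser, which is roughly what the extension automates.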

                    [email protected]
                    wrote last edited by
                    #9

                    I don't have an answer. What I can tell you is that it is BAD. I pretty much can't find useful results post 2022/23

                      In reply to #2:
                      [email protected]
                      wrote last edited by
                      #10

                      If it were websites made with AI, why wouldn't Kagi find them just the same? Search engines just search key terms. Can't see how it would know if the term was typed by a person or a bot. That said, I used SearXNG and it wasn't bad.

                        [email protected]
                        wrote last edited by
                        #11

                        Ironically, the stupid AI overview is more convenient for some stuff now because it cuts down to the core of what you're asking and you don't have to deal with those goddamn terribly written SEO websites. I still prefer not to use it, but when it's really bad and I really can't find anything useful, I will leave it on sometimes just to at least get a general answer (usually for how-tos and stuff that's easily verifiable).

                          In reply to #10:
                          [email protected]
                          wrote last edited by
                          #12

                          Ah, I guess I misunderstood the problem

                            [email protected]
                            wrote last edited by
                            #13

                            I've almost given up on just searching the whole internet for something. I either filter by eye for sites I trust in the results, or add a filter to the query. There are usually a handful of sites I trust on a given topic.

                              In reply to #5:
                              [email protected]
                              wrote last edited by
                              #14

                              I switched to presearch.com long ago. No more tracking.

                                [email protected]
                                wrote last edited by [email protected]
                                #15

                                No, because there's no reliable way to distinguish AI-generated spam sites from non-AI-generated spam sites. I'll also add that I don't expect there to be one promptly forthcoming: any attempt to identify them is going to run into improved systems, and that's gonna happen even if the systems aren't explicitly intending to evade detection. If it were easy, Google would have done so years back. I can recognize some now, but the SEO spam crowd that's creating this is trying hard to pollute search engine results, and if someone implements a generalized "block" that's effective, they're going to keep looking for alternatives until they find something that gets through.

                                On Kagi, I can set the acceptable date range on results to prior to the emergence of LLMs, but that cuts out a lot of material that I want to see. For some searches, that might work, but it's not really a general solution.

                                You can manually blacklist or deprioritize sites on Kagi. Probably can either run some sort of local proxy or Greasemonkey-style plugin that would let you do so in browser on any search engine. Problem is that there are people making these sites faster than you're going to be banning them.

                                Kagi's also got a "pin" and a "raised priority" feature for a list of sites, and I suppose one could whitelist some "known good" sites. Kagi's "blacklist/deprioritize/prioritize/pin" feature does not have the ability to exchange sites between users (and I imagine that there'd be some privacy issues with doing so) aside from Kagi running a "leaderboard" of the most-blacklisted/deprioritized/prioritized/pinned sites. One could probably do the "proxy" or "plugin" route as well for a variety of websites on other search engines. Any general solution would need to have some level of interchange, since requiring every individual user to maintain a "killfile" on websites is going to be impractical. It may be that the human labor involved in curation is outweighed by how cheap it is to generate new websites; not sure.

                                At some point, I assume that it may become practical to just make a conservative whitelist of "non-spam" sites that accepts that many useful websites will be excluded because we just can't validate them as being non-spam. That would probably require human curation, which is either going to need volunteer labor or a commercial service.

                                There's also a secondary problem that if you curate content at the domain level, Web 2.0 sites that permit posting content (Reddit, Wikipedia, the Threadiverse, etc) can have individual users inserting AI-generated spam. So a general solution is probably going to need to permit some sort of sub-domain level filtering for at least major sites.

                                And there's also the wrinkle that a "trusted good" site or user can become a spammer at some point. Spammers/people who want to run influence operations have been buying high-karma Reddit accounts --- and the reputation that comes with them --- for quite some years. Domains expire, or their operators change. Reputation has value, and it can be sold. So that also has to be addressed.

                                This isn't really a qualitative change. I mean, people have hand-crafted spam websites that try to grab searchers before. It's just that the ability to use a computer to do it is way more cost-efficient, brings the cost way down, and thus opens up a lot of opportunity for spam that wouldn't have made sense financially before. So what you're really aiming to do is to get the cost to make a spam website up. One possibility --- which I am absolutely confident that TLS certificate issuers would like --- would be to have tiers of TLS certificate, some of which are a lot more expensive. Search engine indexers could check and validate the TLS "cost tier" when indexing a site. That will artificially inflate the cost of running a website, and can be done to an arbitrary degree. That's not fantastic, since it also tends to cut out non-spam individual/low-cost websites, but if you're a large company somewhere, the price is basically a rounding error compared to what a spammer needs to make to make his super-cheap-to-generate LLM-generated website worthwhile. Could be a component in a system that takes into account other factors.
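The block / lower / raise / pin idea described above can be approximated locally on any list of result URLs, for example inside a proxy or userscript. A minimal Python sketch, assuming a hand-maintained per-domain preference table; the `PREFS` mapping, the action names, and the domains are all hypothetical, and this is not a description of how Kagi implements its feature:

```python
from urllib.parse import urlsplit

# Hypothetical per-domain preferences, maintained by hand.
PREFS = {
    "trusted-baking.example": "pin",
    "goodfood.example": "raise",
    "meh-blog.example": "lower",
    "spamfarm.example": "block",
}

# Lower rank sorts first; domains with no preference sit in the middle.
RANK = {"pin": 0, "raise": 1, None: 2, "lower": 3}


def domain_action(url, prefs):
    """Return the action configured for the URL's domain, or None."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Walk up the labels so 'www.spamfarm.example' matches 'spamfarm.example'.
    for i in range(len(labels)):
        action = prefs.get(".".join(labels[i:]))
        if action:
            return action
    return None


def rerank(urls, prefs=PREFS):
    """Drop blocked domains, then order the rest by preference tier."""
    kept = [u for u in urls if domain_action(u, prefs) != "block"]
    # sorted() is stable, so the engine's original order survives within a tier.
    return sorted(kept, key=lambda u: RANK[domain_action(u, prefs)])


results = [
    "https://www.spamfarm.example/waffles",
    "https://unknown.example/a",
    "https://meh-blog.example/b",
    "https://goodfood.example/c",
    "https://trusted-baking.example/d",
]
for url in rerank(results):
    print(url)
```

This keeps the maintenance problem the post points out: the table only helps for domains someone has already judged, so it works better as a whitelist-style boost than as a complete spam filter.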

                                  [email protected]
                                  wrote last edited by
                                  #16

                                  yeah, use engines like startpage.com instead.

                                    [email protected]
                                    wrote last edited by
                                    #17

                                    before:2023
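To spell that out: `before:` is a Google search operator that restricts results by date, and it can be combined with the `-term` exclusion suggested earlier in the thread. A small Python sketch that assembles such a query string (the `build_query` helper and its parameters are hypothetical, and other search engines may ignore these operators or treat them differently):

```python
from typing import Optional, Sequence


def build_query(terms: str,
                before_year: Optional[int] = None,
                exclude: Sequence[str] = ()) -> str:
    """Append '-word' exclusions and a 'before:YEAR' operator to a query."""
    parts = [terms]
    # Each excluded word becomes a '-word' token.
    parts.extend(f"-{word}" for word in exclude)
    if before_year is not None:
        # Google-style date operator; assumed here, not guaranteed elsewhere.
        parts.append(f"before:{before_year}")
    return " ".join(parts)


print(build_query("sourdough discard waffles", before_year=2023))
```

As noted elsewhere in the thread, a hard date cutoff also hides everything legitimate published after it, so it suits "temporally agnostic" searches like recipes more than current topics.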

                                      [email protected]
                                      wrote last edited by
                                      #18

                                      Try uBlacklist, with these blocklists:

                                      # AI Spam
                                      https://raw.githubusercontent.com/laylavish/uBlockOrigin-HUGE-AI-Blocklist/main/list_uBlacklist.txt
                                      # Copycat Sites
                                      https://raw.githubusercontent.com/quenhus/uBlock-Origin-dev-filter/main/dist/other_format/uBlacklist/global.txt
                                      # SEO Spam & Junk
                                      https://raw.githubusercontent.com/NotaInutilis/Super-SEO-Spam-Suppressor/main/ublacklist.txt
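Those subscriptions are plain text files of uBlacklist rules. Most lines appear to be match patterns such as `*://*.example.com/*`, with `#` starting a comment. A rough Python sketch of checking a URL against that kind of list, assuming only the simple match-pattern form (lines starting with `/` or `@` are skipped here, and the sample domains are invented rather than taken from the real lists):

```python
import re
from typing import List
from urllib.parse import urlsplit


def pattern_to_regex(pattern: str) -> "re.Pattern":
    """Translate a match pattern like '*://*.example.com/*' into a regex."""
    scheme, rest = pattern.split("://", 1)
    host, _, path = rest.partition("/")
    scheme_re = "https?" if scheme == "*" else re.escape(scheme)
    if host == "*":
        host_re = r"[^/]+"
    elif host.startswith("*."):
        # '*.example.com' covers example.com itself and any subdomain.
        host_re = r"(?:[^/]+\.)?" + re.escape(host[2:])
    else:
        host_re = re.escape(host)
    # '*' in the path is a wildcard; everything else is literal.
    path_re = ".*".join(re.escape(piece) for piece in path.split("*"))
    return re.compile(f"^{scheme_re}://{host_re}/{path_re}$")


def load_rules(text: str) -> List["re.Pattern"]:
    """Parse list text, keeping only simple match-pattern lines."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and rule types this sketch does not handle.
        if not line or line[0] in "#/@" or "://" not in line:
            continue
        rules.append(pattern_to_regex(line))
    return rules


def is_blocked(url: str, rules: List["re.Pattern"]) -> bool:
    """Return True if the URL matches any loaded rule."""
    parts = urlsplit(url)
    if not parts.hostname:
        return False
    # Normalize so a bare 'https://host' still has a '/' path.
    normalized = f"{parts.scheme}://{parts.hostname}{parts.path or '/'}"
    return any(rule.match(normalized) for rule in rules)


SAMPLE = """
# made-up entries for illustration
*://*.spam-recipes.example/*
*://blog.example.org/ai/*
"""

rules = load_rules(SAMPLE)
print(is_blocked("https://www.spam-recipes.example/waffles", rules))
print(is_blocked("https://blog.example.org/real/post", rules))
```

In practice the extension does this matching itself once the list URLs are added as subscriptions; the sketch is only meant to show what the entries mean.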
                                      
                                        [email protected]
                                        wrote last edited by
                                        #19

                                        https://udm14.org/

                                          In reply to #10:
                                          [email protected]
                                          wrote last edited by
                                          #20

                                          Kagi does seem to cut out a lot of blogspam. I think Google is incentivised to send people to these sites with adwords ads on them.
