agnos.is Forums

APIs vs Web Scrapers

Programmer Humor
programmerhumor
11 Posts 10 Posters 1 Views
  • cm0002@lemmy.world
    This post did not contain any content.

    tropicaldingdong@lemmy.world wrote (#2), in reply to cm0002@lemmy.world:

    beautiful soup

      H wrote (#3), in reply to cm0002@lemmy.world:

      As long as the scraper follows robots.txt
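For readers wondering what "following robots.txt" looks like in practice, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `is_allowed`, the bot name `example-bot`, the sample rules, and the `example.com` URLs are all made up for illustration; a real scraper would fetch the site's actual robots.txt first.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt rules permit user_agent to fetch url.

    robots_lines is the robots.txt content as a list of lines
    (hypothetical helper, not part of any library).
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network access
    return parser.can_fetch(user_agent, url)


# Illustrative rules only: block everything under /private/ for all agents.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
""".splitlines()

if __name__ == "__main__":
    for path in ("/public/page", "/private/page"):
        url = "https://example.com" + path
        print(path, is_allowed(EXAMPLE_ROBOTS, "example-bot", url))
```

A polite scraper would run a check like this before each request and skip any URL that comes back disallowed.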

        J wrote (#4), in reply to H:

        It's equivalent to "the code."

          M wrote (#5), in reply to cm0002@lemmy.world:

          I feel like there should be a third box with Wall Street raider types, for scrapers that use Selenium browser automation.

          I don’t think it’s entirely unblockable - AdSense seems to know to only serve unmonetized PSA ads - but I think it’s very difficult to discriminate between “this is a real browser controlled by an end user” and “this is a real browser being controlled by automated test software”.

            K wrote (#6), in reply to J:

              S wrote (#7), in reply to cm0002@lemmy.world:

              Love me some Scrapy spiders

                erytau@programming.dev wrote (#8), in reply to M:

                Fourth panel as well, with those bots collecting data for AI training that don't respect your robots.txt, change user agents and overload your servers

                  kojichan@lemmy.world wrote (#9), in reply to cm0002@lemmy.world:

                  I just saw a Python scraper in my server logs earlier today. Strangest thing to see.

                    D wrote (#10), in reply to J:

                    It really should be "parley.txt".

                      D wrote (#11), in reply to erytau@programming.dev:

                      War boys from Fury Road?
