  1. #1

    Improving Internet Download Speeds On The Server Itself

    My partner and I run a business which has specific needs. We are not sure of the best way to meet them, so we will describe our business model as well as we can in the hope that someone can suggest a service which will improve our results. Perhaps that means renting a dedicated server, or upgrading our own internet connection; we don't know.

    Our business consists of running software from a computer which is connected to the internet. The software opens a large number of Internet browsers, and controls them. Each browser navigates to the same page, and then each browser refreshes that page continuously, as quickly as the internet connection will support, scraping the screen each time to detect certain changes in the HTML. Naturally, the "faster" the internet connection, the better results we get, since each browser is able to refresh many more times per second, and so the delay between when the HTML on the target page actually changes and when one of our browsers picks up the change is minimized. That is our goal.

    We would like to know what technologies will enable us to improve the performance of this program. Please ask if anything is unclear. Thanks very much.

  2. #2
    Join Date
    Jun 2004
    Location
    Bay Area
    Posts
    1,320
    Why don't you just download the page source? It reflects every change as well, and it is much easier to refresh and to check for changes. I've never heard of having to actually render the pages in a browser window just to detect changes.
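
A rough sketch of that idea in Python, using only the standard library. The function names, the polling interval, and the injectable `fetcher` parameter are assumptions made for illustration; nothing here comes from the original poster's software.

```python
import hashlib
import time
import urllib.request


def fingerprint(html: bytes) -> str:
    """Return a digest so two downloads can be compared cheaply."""
    return hashlib.sha256(html).hexdigest()


def fetch(url: str, timeout: float = 5.0) -> bytes:
    """Download the raw page source; nothing is rendered."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def watch(url: str, interval: float = 1.0, fetcher=fetch):
    """Yield the page source each time it differs from the previous download."""
    last = None
    while True:
        html = fetcher(url)
        digest = fingerprint(html)
        if last is not None and digest != last:
            yield html  # the page changed since the last poll
        last = digest
        time.sleep(interval)
```

Calling `next(watch(url))` would block until the first change is seen. Comparing digests of the whole page flags any byte-level difference, so a page with rotating ads or timestamps would need the comparison narrowed to the relevant part of the HTML.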

    Anyway, the limiting factor here is probably your CPU and RAM. Each browser window takes a couple of MB of RAM, and so does the content of the website. Rendering all these pages over and over also costs a great deal of CPU time.

  3. #3
    Join Date
    Oct 2002
    Posts
    705
    Don't use a browser, as you'll be wasting time rendering pages, making requests for images, etc. You need to use a program like wget, which will just download the raw HTML. Avoiding DNS lookups will help as well, for example by making the requests directly to an IP address.

    Combine that with putting the server as physically close as possible to the target website you want to scrape, and that's about as fast as you can get.
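
As a hedged illustration of requesting by IP, here is a minimal Python sketch using the standard `http.client` module. The IP address `203.0.113.10`, the hostname `www.example.com`, and the helper names are placeholders, not details from this thread. The `Host` header is set by hand because the server still needs to know which site is being requested when the connection is made to a bare IP.

```python
import http.client


def host_headers(hostname: str) -> dict:
    """Headers for a request sent to a bare IP: the site name goes in Host."""
    return {"Host": hostname, "Connection": "keep-alive"}


def fetch_by_ip(conn, path: str, hostname: str) -> bytes:
    """Request `path` over an already-open connection and return the raw body."""
    conn.request("GET", path, headers=host_headers(hostname))
    response = conn.getresponse()
    # The body has to be read in full before the connection can be reused.
    return response.read()


# Usage sketch (placeholder address and hostname, not executed here):
# conn = http.client.HTTPConnection("203.0.113.10", 80, timeout=5)
# html = fetch_by_ip(conn, "/", "www.example.com")
```

Reusing one open connection for repeated requests also avoids a new TCP handshake on every refresh, provided the server allows keep-alive.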
