  1. #1

    Page Rank script

    Where to find a good / free PHP Page Rank script
    something like the one used here

  2. #2
    Join Date
    Nov 2001
    The above link goes against Google's TOS, and their web API doesn't provide a way to get PageRank.

    People have sniffed the packets from the toolbar to determine the URL, which sends back an XML file. The problem is that there is a checksum in the URL, and it is a moving target because Google occasionally changes it.

    a good discussion about the problem can be found here:

  3. #3
    Thank you for your reply, but the link doesn't work for me because it's not a free forum.

  4. #4
    Join Date
    Nov 2001
    Strange, when I clicked through from Google I was able to get in, but from here I am not able to... strange.

    Here are some of the better quotes:
    Almost got it figured out. I did a packet sniff (had to remember how that worked) and recovered the googlebar requesting the following XML document for

    I can access this XML document directly within IE5. I assume other browsers support XML too. It contains lots of cool info in plain text, including the PageRank. The only problem is that the ch variable seems to be some type of redundant encryption of the URL. In other words, you have to know the correct ch to get the XML document. It might also be encrypted to your specific IP, so I'm not sure that you will be able to access my page. Anybody know how to generate the ch?
    If you do a search for this:

    in which "" is the page whose PageRank you want, you will get back this from Google (all angle brackets were changed to braces):

    {?xml version="1.0" encoding="ISO-8859-1" standalone="no" ?}
    {!DOCTYPE GSP (View Source for full doctype...)}
    - {GSP VER="3.1"}
    - {RES SN="1" EN="1"}
    - {R N="1" L="1"}
    {T}MyDomain Name Search{/T}
    {S}MyDomain name search. If you can't spell somebody's name, use{br} your best guess for their last name only: Last name only: {b}...{/b}{/S}
    - {HAS}
    {L TAG="link:" /}
    {C SZ="3k" TAG="cache:" /}
    {RT TAG="related:" /}

    The PageRank is between the {RK} and {/RK} -- in this case, it's a 6.

    You can see that the title and the first sentence on the page also come back. If the page is in the ODP directory (not the case in this example) this info also comes back from Google, with the category that it is in.
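
    A minimal sketch of pulling those fields out of such a response, assuming the text has been saved with braces in place of angle brackets as in the listing above. The sample string is hypothetical: it reuses the {T} line from the listing and adds an {RK} element by hand, since the quoted listing is cut off before that tag. The helper names are made up for illustration.

```python
import re


def braces_to_angles(text):
    """Undo the forum's substitution of braces for angle brackets."""
    return text.replace("{", "<").replace("}", ">")


def extract_tag(xml_text, tag):
    """Return the text between <tag> and </tag>, or None if the tag is absent."""
    pattern = r"<{0}>(.*?)</{0}>".format(re.escape(tag))
    match = re.search(pattern, xml_text, re.DOTALL)
    return match.group(1).strip() if match else None


# Hypothetical sample: the RK line is added by hand for illustration only.
sample = """
{R N="1" L="1"}
{T}MyDomain Name Search{/T}
{RK}6{/RK}
{/R}
"""

xml_text = braces_to_angles(sample)
rank = extract_tag(xml_text, "RK")
title = extract_tag(xml_text, "T")
print("PageRank:", rank)
print("Title:", title)
```

    A regular expression is used here rather than a real XML parser because the pasted listing is a browser tree view (leading dashes, abbreviated doctype) and would not parse as well-formed XML.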

    However, there's a catch that makes it more complex. You need the "ch=0123456789" in the query string. It appears to be a ten-digit checksum based on the domain name you are requesting. If the number does not match with that domain, from which it is apparently generated within the toolbar code, you get a "not authorized" message in Explorer instead of the above information.

    Writing a script would require knowing how this 10-digit checksum is generated. You'd have to collect a bunch of domains and checksums, and try to see if there's a pattern. It might be a simple checksum, or it might even be some sort of one-way hash.
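
    The "collect pairs and look for a pattern" idea could be sketched roughly as below. Everything here is an assumption for illustration: the candidate functions are just common checksums to rule in or out, none of them is claimed to be what the toolbar uses, and the observations are generated with CRC-32 only so the demo has something to find. Real observations would come from sniffed toolbar requests.

```python
import zlib


def byte_sum(data):
    """Plain additive checksum: the sum of all byte values."""
    return sum(data)


# Candidate functions to test. These are guesses, not the toolbar's algorithm.
CANDIDATES = {
    "byte_sum": byte_sum,
    "crc32": lambda data: zlib.crc32(data) & 0xFFFFFFFF,
    "adler32": lambda data: zlib.adler32(data) & 0xFFFFFFFF,
}


def matching_candidates(observations):
    """Return the names of candidates that reproduce every (url, checksum) pair."""
    matches = []
    for name, func in CANDIDATES.items():
        if all(func(url.encode("ascii")) == ch for url, ch in observations):
            matches.append(name)
    return matches


# Hypothetical observations, generated with CRC-32 purely for demonstration.
urls = ["http://example.com/", "http://example.org/page.html"]
observations = [(u, zlib.crc32(u.encode("ascii")) & 0xFFFFFFFF) for u in urls]

print(matching_candidates(observations))
```

    If the real algorithm is a one-way hash or uses a secret constant, a test like this simply returns an empty list, which is itself useful to know before spending more time on pattern hunting.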

    I don't think it's worth the effort for a single-digit PageRank.
    The checksum varies by page requested and not just by domain, i.e. has a different checksum to - of course, for PR this is probably not a problem.

    As doofus says, the calculations necessary make this a tricky task. While it's definitely cool to see the data being received, there's probably limited scope for automating it.

    I suspect that as soon as the checksum algo was decoded it could be changed anyway. Since the googlebar is self-updating, it would be a moving target.
    Unless someone knows how to reverse engineer the software itself. I looked at a few of the checksums (and came to the same conclusion). It appears as though they are pretty diverse. For example: gives a checksum of 14282204401 and gives a checksum of 13409342805.

    Anyways, I'm sure there is some method of stepping through the software with a debugger or whatnot to determine the checksum method, but just hacking it out by looking at patterns seems unlikely to work. Oh well.
    The checksum algorithm was changed by Google sometime in May 2002. It was consistent from December 2001 (or earlier) to May 2002, but then it changed.

    It's not too surprising that the algo was changed. What's more surprising is that Google cleverly does not return an error message for PageRank queries coming in that use the obsolete checksum. Instead of an error message, you get bogus PageRank values. These values are typically plus or minus two complete digits on the 0-10 scale. Sites that were a 7 might be a 9. One site that was an 8 became a 10.

    This is the famous Google sense of humor at work.

    Since the toolbar is self-updating, the checksum algo can be made a moving target. Anyone who goes to all the trouble to decompile and analyze the algo still has to keep checking with the latest toolbar in Explorer to make sure the PR values coming back are not bogus due to a change on Google's end. Whatever clever program anyone writes after cracking the checksum algo will not be self-updating from Google, I presume.
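
    That ongoing check could look something like the sketch below: keep a small table of values read by hand from the current toolbar and compare them with whatever the script returns. The URLs, the reference values, and the stale_fetch stand-in are all hypothetical; stale_fetch just imitates the "plus two" bogus results described above in place of a real lookup.

```python
def find_mismatches(reference, fetch_rank):
    """Compare fetched ranks with hand-checked values; return the disagreements."""
    mismatches = {}
    for url, expected in reference.items():
        actual = fetch_rank(url)
        if actual != expected:
            mismatches[url] = (expected, actual)
    return mismatches


# Hypothetical reference values, read by hand from the toolbar.
reference = {"http://example.com/": 7, "http://example.org/": 8}


def stale_fetch(url):
    """Stand-in for a real lookup; simulates bogus values from an old checksum."""
    return min(10, reference[url] + 2)


bad = find_mismatches(reference, stale_fetch)
if bad:
    print("Checksum algorithm may have changed:", bad)
```

    Because the bogus values come back without any error, a comparison against known-good values is about the only signal a script would get that its checksum has gone stale.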

    None of us likes using Explorer with the Google toolbar. But Google makes the rules, and Google finds ways to make us play by their rules.
