
View Full Version : Google weirdness
cperciva 11-10-2002, 12:29 PM Try googling for "rsa library". The result at the top, from web.comlab.ox.ac.uk, is some of my code (or rather, documentation for it). How did it get up there? There's only one page which links to it; and only one page which links to that page. Yes, it is entirely relevant for the search query, but no more so than dozens of other RSA libraries. Maybe Google gives a major bonus to web pages on "academic" servers?
I suppose I should just be happy about this, but it really is boggling.
filburt1 11-10-2002, 01:21 PM PageRank is indeed the biggest mystery of the Intarweb. :)
Originally posted by filburt1
PageRank is indeed the biggest mystery of the Intarweb.
A bigger one is, "what the hell is the Intarweb?" :)
Seriously, nothing surprising about that ranking at all, and it has little to do with the mysterious PageRank. In fact, the PR of that page is only (approximately) 3... but that seems to be as good as that of any page listed. The page's linking is weak, but so is that of every other page returned for the query. And it's important to remember that PageRank is only one element in Google's ranking algorithms, and it tends to be less significant in noncompetitive categories and in those where prevailing PR is low -- like this one.
The page ranks well for that search term because it's the only one of the set that has the keywords both in the document title and prominently in an H1 tag at the top of the page, and one of the keywords is also in the filename.
But again the biggest factor is that "rsa library" is not a competitive search term.
cperciva 11-12-2002, 12:00 PM Originally posted by JayC
Seriously, nothing surprising about that ranking at all, and it has little to do with the mysterious PageRank. In fact, the PR of that page is only (approximately) 3... but that seems to be as good as that of any page listed.
Ok, another question then: If being linked to by only one page -- a page which, itself, is only linked to by one page -- is enough to give you a PageRank of 3, how is it possible to get a PageRank of less than 3? Have *no* pages linking to you?
Originally posted by cperciva
Ok, another question then: If being linked to by only one page -- a page which, itself, is only linked to by one page -- is enough to give you a PageRank of 3, how is it possible to get a PageRank of less than 3? Have *no* pages linking to you?
Actually, I'm seeing that linking page as having a PR4, not a PR3. But one perplexing thing about Google is that the link: command doesn't return all of the linking pages; it shows only those above some PageRank level, usually those with a 4 or greater -- as appears to be the case here. So in fact there are more links than that one; this query (http://www.google.com/search?q=%22web.comlab.ox.ac.uk/oucl/work/colin.percival/source/lib/rsa.html%22&hl=en&lr=&ie=UTF-8&filter=0) shows that there are probably about 23 (but I didn't check all of these to confirm that they are actually links).
cperciva 11-14-2002, 08:07 AM No, those aren't links; they show up because they share several "words" in their URLs -- they are other files in the same directory. The only "link" to the page in question is from the automagic index which Apache generates for the directory (because I haven't put up an index.html file).
Originally posted by cperciva
No, those aren't links; they show up because they share several "words" in their URLs -- they are other files in the same directory.
Yep, on closer look you're right. The problem with that last query (usually that approach works) is that the URL was too long. The different components of a URL-based query are broken into words, and Google limits queries to 10 words -- so the "lib/rsa.html" segment was ignored, and as a result many other pages were returned. Then, glancing quickly at long pages with lots of links, I assumed most of them would be real links, but I didn't check!
But this query (http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&as_qdr=all&q=+%22ox+%2Bac+%2Buk+%2Boucl+%2Bwork+%2Bcolin+%2Bpercival+%2Bsource+%2Blib+%2Brsa.html%22&btnG=Google+Search) leaves off the first part of your url in favor of the last part, and returns just the one linking page.
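To make the 10-word limit described above concrete, here is a rough sketch of how the first URL would be cut off. It assumes the URL is split into "words" at every non-alphanumeric character; that splitting rule, and the names query_words and MAX_QUERY_WORDS, are assumptions made for illustration, not Google's documented behaviour.

```python
import re

# Word limit as described in the post above.
MAX_QUERY_WORDS = 10


def query_words(url, limit=MAX_QUERY_WORDS):
    """Split a URL into word-like tokens and apply the word limit.

    Returns (kept, dropped). Splitting on every non-alphanumeric
    character is an assumption, not Google's actual tokenizer.
    """
    words = [w for w in re.split(r"[^A-Za-z0-9]+", url) if w]
    return words[:limit], words[limit:]


url = "web.comlab.ox.ac.uk/oucl/work/colin.percival/source/lib/rsa.html"
kept, dropped = query_words(url)
print(kept)     # the ten words that survive the limit
print(dropped)  # ['lib', 'rsa', 'html']
```

Under that assumption the URL has 13 words, so the last three fall off the end. The remaining ten name only the directory, which is consistent with other files from the same directory showing up in the results.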
So... even with just one link, the linking page has an indicated PageRank of 4 (and since the toolbar shows the PR value "rounded down", the real value could be higher: anywhere below 5). So it's not really surprising for that link to bring a PR3. It's a logarithmic scale; a 3 is pretty easy to get.
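As a rough illustration of what "logarithmic" and "rounded down" mean together, the sketch below maps a raw link score onto a 0-10 toolbar value. The function name toolbar_pr, the idea of a single raw score, and especially the base of 6 are all made up for the example; Google has not published the real mapping.

```python
def toolbar_pr(raw_score, base=6, max_pr=10):
    """Map a raw score to a floored, logarithmic 0..max_pr value.

    Each toolbar step requires `base` times more raw score than the
    previous one. The base of 6 is purely illustrative.
    """
    level = 0
    threshold = base
    # Climb one level for every power of `base` the score reaches.
    while level < max_pr and raw_score >= threshold:
        level += 1
        threshold *= base
    return level


for score in (5, 216, 1295, 1296, 7775):
    print(score, toolbar_pr(score))
```

With this made-up base, raw scores of 216 and 1295 both display as 3, while 1296 and 7775 both display as 4. So a page showing a 4 might sit just above the threshold or almost at 5, and the lower levels need very little raw score.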