This section describes how Google uses the "PageRank" algorithm to determine the importance or value of a webpage (and hence where it falls in the search results for a particular topic).
Strogatz asserts that "A page is good if good pages link to it," then discusses this self-referential definition of a good page. (p. 193)
The question is this: who decides which pages are good in the first place? As Strogatz describes, the network does!
"Worrying about content turned out to be an impractical way to rank webpages." (p. 192) We let people vote with their feet (or rather, with their clicks).
Graphs provide a useful way of illustrating how pages interact. If one page links to another, then a directed arrow from the first page to the second indicates it. Here's the graph of the "toy web" Strogatz considers, with the final rankings:
He justifies this ranking in a series of graphs, and a set of equations on page 195. Let's see how these equations work (we'll use this Excel spreadsheet).
He starts by assigning all pages equal weight: in this case, if we call the total weight 1, each page starts with weight 1/3.
Notice that I've written the equations with an index for the stage, rather than with the primes. That's because we keep updating the values: to get them at the next stage, we update based on the previous stage's values.
We just "do it again", over and over....
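Doing it "over and over" can be sketched in a few lines of Python. This is only a minimal sketch, not the spreadsheet itself: the link structure in `links` is an assumed three-page example (X links to Y and Z, Y links to Z, Z links to X), and the names `update` and `iterate` are made up for illustration. Each page splits its current weight equally among the pages it links to.

```python
from fractions import Fraction

# Assumed link structure, for illustration only:
# each page maps to the list of pages it links to.
links = {
    "X": ["Y", "Z"],
    "Y": ["Z"],
    "Z": ["X"],
}

def update(weights, links):
    """One stage: every page passes its weight, split equally, along its links."""
    new = {page: Fraction(0) for page in weights}
    for page, targets in links.items():
        share = weights[page] / len(targets)
        for target in targets:
            new[target] += share
    return new

def iterate(links, steps):
    """Start with equal weights (total weight 1) and apply the update repeatedly."""
    n = len(links)
    weights = {page: Fraction(1, n) for page in links}
    for _ in range(steps):
        weights = update(weights, links)
    return weights

weights = iterate(links, 50)
for page, w in weights.items():
    print(page, float(w))
```

With this assumed link structure, the first update turns the equal weights 1/3, 1/3, 1/3 into 1/3, 1/6, 1/2, and after many updates the weights settle close to 2/5, 1/5, 2/5. The total weight stays at 1 at every stage, because each page gives away exactly what it had.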
This looks like a recurrence relation, such as the one that the Fibonacci numbers satisfy. It is! It just has three things going at once: the three weights for the three web pages (imagine how big the system is for the entire web!).
This "system of equations" is an example from the field of mathematics called "linear algebra". If you loved algebra, wait -- there's more! :)
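For the curious, here is roughly how linear algebra would write the update. The symbols are my own choices, not Strogatz's: $\mathbf{w}_n$ is the list of page weights at stage $n$, $d_j$ is the number of links leaving page $j$, and $M$ is the table of shares. This is the simplified update used for the toy web, assuming every page has at least one outgoing link.

```latex
\[
\mathbf{w}_{n+1} = M\,\mathbf{w}_n,
\qquad
M_{ij} =
\begin{cases}
1/d_j & \text{if page } j \text{ links to page } i,\\
0 & \text{otherwise.}
\end{cases}
\]
```

The final rankings are the weights that no longer change when we update, that is, a list $\mathbf{w}$ with $\mathbf{w} = M\,\mathbf{w}$.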
Now here's the big question: How do you improve the value of your website, given that you understand the PageRank algorithm?
Side note: Google's plan to prioritize facts ticks off climate deniers: The strategy isn't being implemented yet, but the paper presented a method for adapting algorithms such that they would generate a "Knowledge-Based Trust" score for every page. To do this, the algorithm would pick out statements and compare them with Google's Knowledge Vault, a database of facts. It would also attempt to assess the trustworthiness of sources -- for example, a reputable news site versus a newly created WordPress blog....