A search engine is a program
designed to help find files stored on a computer, for example a
public server on the World Wide Web, or one's own computer. The
search engine allows one to ask for media content meeting
specific criteria (typically those containing a given word or
phrase) and retrieves a list of files that match those criteria.
A search engine often uses a previously built and regularly
updated index to look for files after the user has entered
search criteria.
In the context of the Internet, the term
usually refers to search engines for the World Wide Web and not
for other protocols or areas. Furthermore, search engines mine data
available in newsgroups, large databases, or open directories
like DMOZ.org. Because the data collection is automated, they
are distinguished from Web directories, which are maintained by
people.
The vast majority of search engines are
run by private companies using proprietary algorithms and closed
databases, the most popular currently being Google (with MSN
Search and Yahoo! closely behind). There have been several
attempts to create open-source search engines, among which are
Htdig, Nutch, Egothor, and OpenFTS. [1] (http://www.searchtools.com/tools/tools-opensource.html)
History
The first Web search engine was "Wandex",
a now-defunct index collected by the World Wide Web Wanderer, a
web crawler developed by Matthew Gray at MIT in 1993. Another
very early search engine, Aliweb, also appeared in 1993 and
still runs today. One of the first engines to later become a
major commercial endeavor was Lycos, which started at Carnegie
Mellon University as a research project in 1994.
Soon after, many search engines appeared
and vied for popularity. These included WebCrawler, Hotbot,
Excite, Infoseek, Inktomi, and AltaVista. In some ways they
competed with popular directories such as Yahoo!. Later, the
directories integrated or added on search engine technology for
greater functionality.
In 2002, Yahoo! acquired Inktomi, and in
2003 it acquired Overture, which owned AlltheWeb and
AltaVista. In 2004, Yahoo! launched its own search engine based
on the combined technologies of its acquisitions, providing a
service that gave pre-eminence to the Web search engine over the
directory.
Search engines were also among
the brightest stars of the Internet investing frenzy that
occurred in the late 1990s. Several companies entered the market
spectacularly, posting record gains during their initial
public offerings. Some have since taken down their public
search engines entirely and now market enterprise-only editions, such
as Northern Light (http://www.northernlight.com/),
which was one of the 8 or 9 early search engines that appeared after
Lycos came out.
Before the advent of the Web, there were
search engines for other protocols or uses, such as the Archie
search engine for anonymous FTP sites and the Veronica search
engine for the Gopher protocol.
Osmar R. Zaïane's
From Resource Discovery to Knowledge Discovery on the Internet
details the history of search engine technology prior to the
emergence of Google.
Recent additions to the list of search
engines include a9.com, AlltheWeb, Ask Jeeves, Clusty, Gigablast,
Ez2Find, Teoma, WiseNut, GoHook, Walhello, Kartoo, Snap and
Mamma.
Google
Around 2001, the Google search engine rose
to prominence. Its success was based in part on the concept of
link popularity and
PageRank. How many other web sites and web pages link to a
given page is taken into consideration with PageRank, on the
premise that good or desirable pages are linked to more than
others. The PageRank of linking pages and the number of links on
these pages contribute to the PageRank of the linked page. This
makes it possible for Google to order its results by how many
web sites link to each found page. Google's minimalist user
interface was very popular with users, and has since spawned a
number of imitators.
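The idea described above, that a page's score depends on the scores of the pages linking to it and on how many links those pages carry, can be illustrated with a minimal sketch. This is not Google's implementation: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the toy link graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return an approximate PageRank score for every page in `links`.

    `links` maps each page to the list of pages it links to.
    `damping` and `iterations` are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the same score for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small base score regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page without links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to c.example, which links to a.example.
toy_web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(toy_web)
ordered = sorted(ranks, key=ranks.get, reverse=True)
```

In this toy graph c.example comes first because three pages link to it, and a.example ranks above b.example and d.example because its single incoming link comes from the highly ranked c.example.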
Researchers
at NEC Research Institute claim to have improved upon Google's
patented PageRank technology by using web crawlers to find
"communities" of websites. Instead of ranking pages, this
technology uses an algorithm that follows links on a webpage to
find other pages that link back to the first one, and so on from
page to page. The algorithm "remembers" where it has been,
indexes the number of cross-links, and relates these into
groupings. In this way virtual communities of webpages are
found. Teoma's search technology uses a communities approach in
its ranking algorithm.
Google and most other web search engines use not only
PageRank but more than 150 criteria to determine relevancy.
PageRank is based on citation analysis, which was developed in
the 1950s by Dr. Eugene Garfield at the University of
Pennsylvania; Google's founders cite Garfield's work in their
original paper. Web link analysis was first developed by Dr. Jon
Kleinberg and his team while working on the CLEVER project at
IBM's Almaden research lab.
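The "communities" idea of following links to find pages that link back to one another can be sketched roughly as follows. This is only an illustration and not the NEC or Teoma algorithm: it assumes that a community is simply a group of pages connected through reciprocal links, and the function name and toy graph are hypothetical.

```python
def link_communities(links):
    """Group pages that are connected through reciprocal (two-way) links.

    `links` maps each page to the list of pages it links to.
    Returns a list of communities, each a sorted list of page names.
    """
    # Keep only link pairs where each page links back to the other.
    mutual = {page: set() for page in links}
    for page, targets in links.items():
        for target in targets:
            if page in links.get(target, ()):
                mutual[page].add(target)
                mutual[target].add(page)

    communities = []
    seen = set()  # pages already assigned to a community
    for page in sorted(mutual):
        if page in seen:
            continue
        # Walk the reciprocal links, remembering where we have been.
        group, stack = set(), [page]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(mutual[current] - group)
        seen |= group
        if len(group) > 1:
            communities.append(sorted(group))
    return communities


# Hypothetical toy graph: d links to a, but a does not link back.
toy_links = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b"],
    "d": ["a"],
    "e": ["f"],
    "f": ["e"],
}
groups = link_communities(toy_links)
```

Here pages a, b and c form one community and e and f another, while d is left out because nothing links back to it.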
Challenges faced by search engines
The web is growing much faster than
any present-technology search engine can possibly index (see
distributed web crawling).
Many web pages are updated frequently,
which forces the search engine to revisit them periodically.
The queries one can make are currently
limited to searching for key words, which may result in
many false positives.
Dynamically generated sites may
be slow or difficult to index, or may produce excessive
results from a single site.
Many dynamically generated sites are
not indexable by search engines; this phenomenon is known as
the invisible web.
Some search engines do not order the
results by relevance, but rather according to how much money
the sites have paid them.
Some sites use tricks to manipulate
the search engine to display them as the first result
returned for some keywords. This can lead to some search
results being polluted, with more relevant links being
pushed down in the result list.
How search engines work
Web search engines work by storing
information about a large number of web pages, which they
retrieve from the WWW itself. These pages are retrieved by a web
crawler (sometimes also known as a spider) — an automated web
browser which follows every link it sees. The contents of each
page are then analyzed to determine how it should be indexed
(for example, words are extracted from the titles, headings, or
special fields called meta tags). Data about web pages is stored
in an index database for use in later queries. Some search
engines, such as Google, store all or part of the source page
(referred to as a cache) as well as information about the web
pages. This cached page always holds the actual search text
since it is the one that was actually indexed, so it can be very
useful when the content of the current page has been updated and
the search terms are no longer in it. This problem might be
considered to be a mild form of linkrot, and Google's handling
of it increases usability by satisfying user expectations that
the search terms will be on the returned web page.
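The crawl, index and cache steps described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the "web" is a hypothetical in-memory dictionary rather than real HTTP requests, words are extracted with a plain regular expression, and the index is a simple mapping from each word to the set of pages containing it.

```python
import re
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://a.example/": ("Search engines index the web",
                          ["http://b.example/"]),
    "http://b.example/": ("A crawler follows every link",
                          ["http://a.example/", "http://c.example/"]),
    "http://c.example/": ("An index maps words to pages", []),
}


def crawl_and_index(start_url, web):
    """Follow links breadth-first and build a word index plus a page cache."""
    index = {}  # word -> set of URLs whose text contains that word
    cache = {}  # URL -> copy of the text that was actually indexed
    queue = deque([start_url])
    seen = {start_url}  # URLs already scheduled, so no page is fetched twice

    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # dead link: nothing to fetch
        text, links = web[url]
        cache[url] = text
        # Extract lowercase words and record which page they came from.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(word, set()).add(url)
        # Schedule every link on the page that has not been seen yet.
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index, cache


index, cache = crawl_and_index("http://a.example/", TOY_WEB)
```

Because the cache keeps the text exactly as it was indexed, a later change to the live page does not affect which words the stored copy contains.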
When a user comes to the search engine and
makes a query, typically by giving key words, the engine looks
up the index and provides a listing of best-matching web pages
according to its criteria, usually with a short summary
containing the document's title and sometimes parts of the text.
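A query lookup of this kind might look like the sketch below. The page data, the requirement that every query word be present, the scoring by word count, and the 60-character snippet are all assumptions chosen for illustration; real engines use far more elaborate criteria.

```python
import re

# Hypothetical indexed pages: URL -> (title, text).
PAGES = {
    "http://a.example/": ("Web crawlers",
                          "A web crawler follows links from page to page."),
    "http://b.example/": ("Search basics",
                          "A web search engine looks up the index; "
                          "the index lists pages for each word."),
    "http://c.example/": ("Cooking", "Recipes for soup and bread."),
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose title or text contains it."""
    index = {}
    for url, (title, text) in pages.items():
        for word in tokenize(title + " " + text):
            index.setdefault(word, set()).add(url)
    return index


def search(query, pages, index):
    """Return (url, title, snippet) tuples for pages containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    # Look up each word in the index and keep pages that contain all of them.
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    scored = []
    for url in matches:
        title, text = pages[url]
        tokens = tokenize(title + " " + text)
        # Assumed ranking criterion: how often the query words occur.
        score = sum(tokens.count(word) for word in words)
        scored.append((score, url, title, text[:60]))
    # Best score first; ties are broken by URL for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(url, title, snippet) for _, url, title, snippet in scored]


INDEX = build_index(PAGES)
results = search("web", PAGES, INDEX)
```

Each result carries the page title and the start of its text, mirroring the short summary a search engine shows next to each hit.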
There is another main type: real-time
search engines (such as Orase (http://www.orase.com),
which is now defunct). Such search engines do not use an index.
The information that a search engine needs is collected only when
a new query is started. Compared to the index-based systems of
Google-like search engines, this real-time approach has some
advantages: the information is always up to date, there are
(almost) no dead links, and fewer system resources are needed.
(Google uses almost 100,000 computers, Orase only one.) But
there are some disadvantages, too: for example, a search takes
longer to complete.
The usefulness of a search engine depends
on the relevance of the results it gives back. While there may
be millions of Web pages that include a particular word or
phrase, some pages may be more relevant, popular, or
authoritative than others. Most search engines employ methods to
rank the results to provide the "best" results first. How a
search engine decides which pages are the best matches, and what
order the results should be shown in, varies widely from one
engine to another. The methods also change over time as Internet
usage changes and new techniques evolve.
Most Web search engines are commercial
ventures supported by advertising revenue and, as a result, some
employ the controversial practice of allowing advertisers to pay
money to have their listings ranked higher in search results.
!!! This article is licensed
under the GNU Free Documentation License, which means that you can copy and
modify it as long as the entire work (including additions) remains under this
license. See
http://www.gnu.org/copyleft/fdl.html for details. It uses material from the
Wikipedia article Search
Engines!!!
Google, Inc. (NASDAQ:
GOOG (http://quotes.nasdaq.com/asp/SummaryQuote.asp?symbol=GOOG&selected=GOOG)),
is a U.S.-based corporation, established in 1998, that manages
the Google search engine. Google is headquartered at the "Googleplex"
in Mountain View, California, and employs over 3,000 workers.
Google's CEO Dr. Eric Schmidt, formerly CEO of Novell, took over
when co-founder Larry Page stepped down.
History
Beginnings
Google began as a research project in
early 1996 by Larry Page and Sergey Brin, two Stanford Ph.D.
students who developed the theory that a search engine based on
analysis of the relationships between Web sites would produce
better results than the basic techniques then in use. It was
originally nicknamed BackRub because the system checked
backlinks to estimate a site's importance.
Convinced that the pages with the most
links to them from other highly relevant Web pages must be the
most relevant ones, Page and Brin decided to test their thesis
as part of their studies, and laid the foundation for their
search engine. They formally founded their company, Google,
Inc., on September 7, 1998 in a friend's garage in Menlo
Park, California. In February 1999, the company moved into
offices at 165 University Avenue in Palo Alto, home of a number
of other noted Silicon Valley technology startups. Google
quickly outgrew the University Avenue site, moving later that
year to a complex of buildings (known by some as "The
Googleplex") on Amphitheatre Parkway in Mountain View.
The Google search engine gained a
following among Internet users for its simple, clean design and
relevant search results. In 2000, Google began selling
advertisements by the keyword so that they would be more
relevant to the end user. The ads were text-based in order to
keep page design uncluttered and fast-loading. The concept of
selling keyword advertising was originally pioneered by
Overture[1] (http://www.content.overture.com/d/USm/about/news/mile.jhtml),
formerly Goto.com. While many of its dot-com siblings went
under, Google quietly rose in stature while turning a profit.
U.S. Patent 6,285,999 (http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1=6,285,999.WKU.&OS=PN/6,285,999&RS=PN/6,285,999)
describing Google's ranking mechanism (PageRank)
was granted on September 4, 2001. The patent was officially
assigned to Stanford University and lists Lawrence Page as the
inventor.
In February 2003, Google acquired Pyra
Labs, owner of Blogger, a pioneering and leading weblog-hosting
Web site. The acquisition seemed inconsistent with the general
mission of Google. However, the move secured the company's
ability to use information gleaned from blog postings to improve
the speed and relevance of articles contained in Google News.
At its peak in early 2004, Google handled
upwards of 80 percent of all search requests on the World Wide
Web through its Web site and clients like Yahoo!, AOL, and CNN.[2] (http://www.onestat.com/html/aboutus_pressbox21.html)
Google's share fell in February 2004 when Yahoo! dropped
Google's search technology in order to deliver independent
results.
Google's declared code of conduct is
"Don't be evil". Their site includes humorous features such as
cartoon modifications
[3] (http://www.google.com/holidaylogos.html)
of their logo for special occasions, the option to display the
site in fictional or humorous languages such as Klingon and
Leet, and April Fools' jokes about the company.
It is conjectured that Google's response
to Yahoo will be personalized searches, using the personal data
that it is gathering from Orkut, Gmail, and Froogle to give results
based on the individual. In fact, there is a Personalized Google
Search (http://labs.google.com/personalized) beta
in Google Labs (http://labs.google.com/),
the experimental section of Google.com.
Etymology
The name "Google" is a play on the word
googol, which was coined by Milton Sirotta, nephew of U.S.
mathematician Edward Kasner in 1938, to refer to the number
represented by 1 followed by a hundred zeros. Google's use of
the term reflects the company's mission to organize the immense
amount of information available on the Web.
Financing and IPO
Google's major investors are the venture
capital firms Kleiner Perkins Caufield & Byers and Sequoia
Capital. In October 2003, while discussing a possible IPO
(Initial Public Offering of shares), the company was approached
by Microsoft about a possible partnership or merger; no such
deal ever materialized.
In January 2004, Google announced the
hiring of Morgan Stanley and Goldman Sachs Group to arrange an
IPO. That IPO (one of the most anticipated in history) was
projected to raise as much as $4 billion. According to a banker
involved in the transaction, the deal would yield an estimated
$12 billion market capitalization for Google.
On April 29, 2004, Google filed an S-1
form with the Securities and Exchange Commission for an IPO to
raise as much as USD $2,718,281,828 (a touch of
mathematical humor: the figure echoes the leading digits of the
constant e, 2.718281828). The filing revealed that Google had turned a
profit every year since 2001 and earned a profit of $105.6
million on revenues of $961.8 million during 2003.
In May 2004, Google officially cut Goldman
Sachs from the IPO, leaving Morgan Stanley and Credit Suisse
First Boston as the joint underwriters. They chose the
unconventional approach of allocating the initial offering through an
auction (specifically a "Dutch auction"), so that "anyone"
would be able to participate in the offering. The smallest
required account balances at most authorized online brokers that
are allowed to participate in an IPO, however, are around
$100,000. In the run-up to the IPO the company was forced to
slash the price and size of the offering, but the process did not
run into any technical difficulties or result in any significant
legal challenges. The initial offering of shares was sold for
$85 apiece. The market valued the stock at $100.34 at the close of the
first day of trading, which saw 22,351,900 shares change hands.
After some initial stumbles, Google's
initial public offering took place on August 19, 2004.
19,605,052 shares were offered at a price of $85 per share. Of
that, 14,142,135 were floated by Google and 5,462,917 by selling
stockholders. The sale raised $1.67 billion, of which
approximately $1.2 billion went to Google. The vast majority of
Google's 271 million shares remained under Google's control. The
IPO gave Google a market capitalization of more than $23
billion. Many of Google's employees became instant paper
millionaires. Ironically Yahoo! also benefited from the IPO
because it owns 2.7 million shares of Google. The company was
listed on the NASDAQ stock exchange under the ticker symbol
GOOG.
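The figures in the paragraph above can be checked with simple arithmetic. The short calculation below uses only numbers stated in the text; the variable names are illustrative, and the results are gross amounts that ignore underwriting fees and other costs, which is an assumption of this sketch.

```python
OFFER_PRICE = 85                      # dollars per share
SHARES_FROM_GOOGLE = 14_142_135       # shares floated by Google
SHARES_FROM_STOCKHOLDERS = 5_462_917  # shares floated by selling stockholders
TOTAL_SHARES = 271_000_000            # approximate total, as stated in the text

# Shares offered in total: should match the 19,605,052 quoted above.
shares_offered = SHARES_FROM_GOOGLE + SHARES_FROM_STOCKHOLDERS

# Gross proceeds of the whole sale (about $1.67 billion).
total_raised = shares_offered * OFFER_PRICE

# Portion that went to Google itself (about $1.2 billion).
google_proceeds = SHARES_FROM_GOOGLE * OFFER_PRICE

# Market capitalization at the offer price (a little over $23 billion).
market_cap = TOTAL_SHARES * OFFER_PRICE

print(f"shares offered:  {shares_offered:,}")
print(f"total raised:    ${total_raised:,}")
print(f"Google proceeds: ${google_proceeds:,}")
print(f"market cap:      ${market_cap:,}")
```

The totals agree with the rounded figures in the text: roughly $1.67 billion raised, roughly $1.2 billion of it for Google, and a valuation just above $23 billion.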
Since the IPO, Google's stock market
capitalization has risen to $50 billion as the stock price has
doubled. On August 19, 2004, the number of shares outstanding was
172.85 million while the "free float" was 19.60 million (which
leaves 89% held by insiders). By January 2005 the number of shares
outstanding had risen by about 100 million to 273.42 million, of
which 53% was held by insiders, making the float 127.70 million
(up about 110 million shares from the first trading day). The two
founders are said to hold almost 30% of the outstanding shares.
The company has not reported any treasury stock holdings as of
the Q3 2004 report.
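The insider percentages follow from the share counts given above: the insider share is whatever part of the outstanding stock is not in the free float. The following check uses only the figures from the text, with illustrative variable names.

```python
# Figures in millions of shares, as quoted in the text.
OUTSTANDING_AUG_2004 = 172.85
FLOAT_AUG_2004 = 19.60
OUTSTANDING_JAN_2005 = 273.42
FLOAT_JAN_2005 = 127.70


def insider_share(outstanding, free_float):
    """Fraction of outstanding shares that is not freely traded."""
    return 1.0 - free_float / outstanding


insiders_aug = insider_share(OUTSTANDING_AUG_2004, FLOAT_AUG_2004)
insiders_jan = insider_share(OUTSTANDING_JAN_2005, FLOAT_JAN_2005)

# Growth in outstanding shares and in the float between the two dates.
outstanding_growth = OUTSTANDING_JAN_2005 - OUTSTANDING_AUG_2004
float_growth = FLOAT_JAN_2005 - FLOAT_AUG_2004

print(f"insiders, August 2004:  {insiders_aug:.1%}")
print(f"insiders, January 2005: {insiders_jan:.1%}")
print(f"outstanding growth: {outstanding_growth:.2f} million")
print(f"float growth:       {float_growth:.2f} million")
```

The rounded results match the 89% and 53% in the text; the float grows by about 108 million shares, which the text rounds to 110 million.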
Corporate culture
Philosophy
Google is known for its relaxed corporate
culture, reminiscent of the Dot-com boom. Google's corporate
philosophy is based on many casual principles including, "You
can make money without doing evil", "You can be serious without
a suit" and "work should be challenging and the challenge should
be fun." A complete list of corporate fundamentals is available
on Google's web site
[4] (http://www.google.com/corporate/tenthings.html).
The company encourages equality along the corporate levels and
tells its employees to work on a personal project one day a
week. Twice a week there is a roller hockey game in the company
parking lot.
Twenty Percent Rule
Each Google employee is allowed to spend
20% of their work week developing new products. Some of these
end up as Google services (most notably Google News).
Googleplex
The lobby of the Googleplex (Google
headquarters) is decorated with a piano, lava lamps and a
real-time projection of current search queries. The hallways are full
of exercise balls and bicycles. Each employee has a Linux
workstation and access to the corporate recreation center. The
recreation center includes a workout room with weights and
rowing machines, locker rooms, washers and dryers, a massage
room, assorted video games, Foosball, a baby grand piano, a pool
table and ping pong. In addition to the rec room, there are
snack rooms stocked with various cereals, gummy bears, M&Ms,
toffee, licorice, cashews, yogurt, carrots, fresh fruit, and
dozens of different drinks including fresh juice, soda and
make-your-own cappuccino. After eating, people can relieve
themselves on digital toilets similar to Japanese toilets.
IPO and culture
Many people have suggested that after
Google's IPO their culture will not be able to stay so "fun" and
focused on the future.[5] (http://www.wired.com/news/business/0,1367,63241,00.html?tw=wn_story_related)
[6] (http://www.ciol.com/content/news/2004/104043001.asp)
The company may be required to answer to shareholders who will
want the company to cut back on employee benefits and to focus
on short-term advances. Also, it may be hard to maintain a
collegial atmosphere when approximately 1,000 (30%) of the
employees are paper millionaires. In a report given to potential
investors, co-founders Sergey Brin and Larry Page promised that
the IPO would not change the company's culture. Later Mr. Page
said, "We think a lot about how to maintain our culture and the
fun elements."
Criticism and controversy
Despite Google's apparent success it has
also managed to become the target of critics.
Copyright issues
A number of organizations have used the
Digital Millennium Copyright Act to demand that Google remove
references to allegedly copyrighted material on other sites.
Google typically handles this by removing the link as requested
and including a link to the complaint in the search results.
There have also been complaints that
Google's web cache feature violates copyright. However, Google
provides mechanisms for requesting that caching be disabled
(which Google respects; it also honors the robots.txt file which
is another mechanism that allows operators of a website to
request that part or all of their site not be included in search
engine results).
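How a crawler might honor a robots.txt file can be shown with Python's standard urllib.robotparser module. The robots.txt content, the example.com URLs and the crawler name "ExampleBot" are hypothetical, and this sketch says nothing about how Google itself processes the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# Parse the rules from text instead of fetching them over the network.
parser.parse(ROBOTS_TXT.splitlines())


def may_index(url, user_agent="ExampleBot"):
    """Return True if the robots.txt rules allow this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


public_allowed = may_index("http://www.example.com/index.html")
private_allowed = may_index("http://www.example.com/private/notes.html")
```

A well-behaved crawler would run a check like this before fetching a page and would skip any URL for which it returns False.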
Multinational corporation
Google is a multinational corporation,
having offices in over a dozen countries
[7] (http://www.google.com/jobs/positions.html).
In order to comply with the varying laws of these countries,
several versions of Google restrict very specific keyword
searches. According to French and German law, for example,
ethnocentrism and historical revisionism are illegal. Google
complies with these laws by banning keyword searches related to
these terms. China, whose human rights record has been widely
criticized by the international community, has in the past
restricted citizen access to popular search engines such as
Altavista, Yahoo, and Google. This complete ban has since been
lifted; however, the government remains active in filtering
internet content.[8] (http://journalism.berkeley.edu/projects/chinadn/en/archives/002885.html)
Partiality
In February 2003, Google banned the ads of
Oceana, a two-and-a-half-year-old non-profit organization, which
was protesting the sewage treatment practices of a major cruise
ship operation. Google cited its editorial policy, which states
that "Google does not accept
advertising if the ad or site advocates against other
individuals, groups, or organizations."
Offensive search results
In April 2004, Google received complaints
that a search for "Jew" on its site listed the anti-Jewish
website Jew Watch at or towards the top of the list. Google
insisted this was a result of their content-oblivious PageRank
algorithm [10] (http://www.google.com/explanation.html).
Privacy
Some have pointed out the privacy
implications of having a centrally located, widely popular data
warehouse of millions of internet users' searches, and how under
existing US law, Google would be required to hand over all such
information to the US government.
It has been claimed that Google infringes
the privacy of visitors by uniquely identifying them using
cookies which are used to track web users' search histories. The
cookies possess excessively distant expiry dates and it is
claimed users' searches are recorded without permission for
advertising purposes. In response Google claims cookies are
necessary to maintain user preferences between sessions and
offer other search features. The use of cookies with distant
expiry dates is not uncommon.
Some users believe the processing of email
message content by Google's Gmail service goes beyond proper
use. The point is often made that people without Gmail accounts,
who have not agreed to the Gmail terms of service but send
email to Gmail users, have their correspondence analyzed without
permission. Google claims that mail sent to or from Gmail is
never read by a human being other than the account holder, and is
only used to improve the relevance of advertisements. Other popular
email services such as Hotmail also scan incoming email to try
to determine whether it is unsolicited email.
Chris Hoofnagle, associate director of the
Electronic Privacy Information Center in Washington, DC warned
that "As courts become more frequent integrators of electronic
records, there is a greater risk of Google ... becoming a
serious privacy threat."
The PageRank system
Google's central PageRank system has been
criticized, some calling it "undemocratic". Common arguments are
that the system is unfairly biased towards large web sites, and
that the criteria for a page's importance are not subject to
peer review. The system is also highly susceptible to
manipulation and fraud through the use of dummy sites. See
Google bomb.
Google Offers Wikimedia Hosting
On February 11, 2005, news [11] (http://news.com.com/Google+may+host+encyclopedia+project/2100-1038_3-5572744.html?tag=nefd.top)
[12] (http://meta.wikimedia.org/wiki/Google_hosting)
emerged that discussions are in progress over the possibility of
Google hosting a section of the Wikimedia Foundation's
information on donated servers and internet transit. Early
information states that no advertising (such as Google's
AdWords) would be necessary on Wikimedia's projects. The
Foundation confirmed that the two groups will hold a private IRC
chat in March to further discuss possibilities, and stressed
that no details have yet been finalised. As a result, some
outsiders have coined the unofficial term "Googlepedia".
!!! This article is licensed under the GNU
Free Documentation License, which means that you can copy and modify it as long
as the entire work (including additions) remains under this license. See
http://www.gnu.org/copyleft/fdl.html for details. It uses material from the
Wikipedia article Google!!!
Yahoo! Inc. (NYSE:
YHOO (http://www.nyse.com/about/listed/lcddata.html?ticker=YHOO))
is an American computer services company with a mission to "be
the most essential global internet service for consumers and
businesses". It operates an Internet portal, a web directory and
a host of other services including the popular Yahoo! Mail. It
was founded by Stanford graduate students David Filo and Jerry
Yang in January 1994 and incorporated on March 2nd, 1995. The
company is headquartered in Sunnyvale, California.
According to Alexa Internet, a web trends
company, Yahoo is the most visited website on the Internet
today. The global network of Yahoo websites received 3 billion
page views per day as of October 2004.
History
Yahoo started out as "Jerry's Guide to the
World Wide Web" but eventually received a new moniker with the
help of a dictionary. The name Yahoo is an acronym for "Yet
Another Hierarchical Officious Oracle," but Filo and Yang insist
they selected the name because they liked the general definition
of a yahoo, as in
Gulliver's Travels by Jonathan Swift: "rude, unsophisticated,
uncouth." Yahoo itself first resided on Yang's student
workstation, "Akebono," while the software was lodged on Filo's
computer, "Konishiki"—both named after legendary sumo wrestlers.
The "yet another" phrasing goes back at least to the Unix
utility yacc, whose name is an acronym for "yet another compiler
compiler".
Yahoo had its initial public offering on
April 12, 1996, selling 2.6 million shares at $13 each.
As Yahoo's popularity has increased, so
has the range of features it offers, making it a kind of
one-stop shop for all the popular activities of the Internet.
These now include Yahoo! Mail (a web-based e-mail service), an
instant messaging client, a very popular mailing list service
(Yahoo! Groups), online gaming and chat, various news and
information portals, online shopping and auction facilities, and
an online payment system (similar to PayPal) called Yahoo!
Paydirect. Many of these are based at least in part on
previously independent services that Yahoo has acquired, such
as the popular GeoCities free web-hosting service, Rocketmail,
and various competing mailing list providers such as eGroups.
Many of these takeovers were controversial and unpopular with
users of the existing services, as Yahoo often changed the
relevant terms of service. One example is Yahoo
claiming intellectual property rights over content on its servers,
which the previous companies had not done.
Yahoo has now begun making partnerships
with telecommunications and Internet providers - such as BT in
the UK, Rogers in Canada and SBC in the US - to create
content-rich broadband services to rival those offered by AOL.
The company offers a branded credit card, Yahoo! Visa,
through a partnership with First USA.
Yahoo was one of the few surviving large Internet companies after the
dot-com bubble burst. Nevertheless, on September
26, 2001, Yahoo stocks closed at an all-time low
of $4.06.
Yahoo formed partnerships with
telecommunications and Internet providers to
create content-rich broadband services to
compete with AOL. On 3 June 2002, SBC and Yahoo
launched a national co-branded dial service. In
July 2003, BT Openworld announced an alliance
with Yahoo. On 23 August 2005, Yahoo and Verizon
launched an integrated DSL service.
In late 2002, Yahoo began to bolster its
search services by acquiring other search
engines. In December 2002, Yahoo acquired
Inktomi. In February 2003, Yahoo acquired
Konfabulator and rebranded it Yahoo! Widgets, a
desktop application, and in July 2003 it
acquired Overture Services, Inc. and its
subsidiaries AltaVista and AlltheWeb. On
February 18, 2004, Yahoo dropped Google-powered
results and returned to using its own technology
to provide search results.
Google then released Gmail, its webmail
service offering 1 GB of storage, on 1 April
2004. Yahoo responded by upgrading the storage
of all free Yahoo Mail accounts from 4 MB to 1
GB, and all Yahoo Mail Plus accounts to 2 GB. In
2007, Yahoo removed the storage meters and made
storage unlimited. On 9 July 2004,
Yahoo acquired e-mail provider Oddpost to add an
Ajax interface to Yahoo! Mail Beta. Google also
released Google Talk, a Voice over IP and
instant messaging service, on 24 August 2005. On
13 October 2005, Yahoo and Microsoft announced
that Yahoo! Messenger and MSN Messenger would
become interoperable.
Yahoo continued acquiring companies to expand
its range of services, particularly Web 2.0
services. Yahoo Launch became Yahoo! Music on 9
February 2005. On 20 March 2005, Yahoo purchased
photo sharing service Flickr. On 29 March 2005,
the company launched its blogging and social
networking service Yahoo! 360°. In June 2005,
Yahoo acquired blo.gs, a service based on RSS
feed aggregation. Yahoo then bought online
social event calendar Upcoming.org on 4 October
2005. Yahoo acquired social bookmark site
del.icio.us on 9 December 2005 and then playlist
sharing community webjay on 9 January 2006.
On 27 August 2007, Yahoo released a new
version of Yahoo! Mail that makes it possible
for users to send instant messages to the
largest combined instant messaging (IM)
community, including users of Yahoo! Messenger
and Windows Live Messenger, and to send free text
messages to mobile phones in the U.S., Canada,
India and the Philippines.
!!! This article is licensed
under the GNU Free Documentation License, which means that you can copy and
modify it as long as the entire work (including additions) remains under this
license. See
http://www.gnu.org/copyleft/fdl.html for details. It uses material from the
Wikipedia article Yahoo!!!
Brin, S. and Page, L., "The
Anatomy of a Large-Scale Hypertextual Web Search Engine". In Proceedings
of 7th Int. World Wide Web Conference, (1998):
"This paper addresses this question of how to build a practical large-scale
system which can exploit the additional information present in hypertext."
Taher Haveliwala. "Topic-Sensitive
PageRank," In Proceedings of the Eleventh International World Wide
Web Conference, May 2002.
Taher Haveliwala, Sepandar
Kamvar, Dan Klein, Christopher Manning, and Gene Golub. "Computing
PageRank using Power Extrapolation," Stanford University Technical
Report, July 2003.