Calculates the Google PageRank for the specified vertices.
page_rank(
graph,
algo = c("prpack", "arpack", "power"),
vids = V(graph),
directed = TRUE,
damping = 0.85,
personalized = NULL,
weights = NULL,
options = NULL
)page_rank_old(
graph,
vids = V(graph),
directed = TRUE,
niter = 1000,
eps = 0.001,
damping = 0.85,
old = FALSE
)
The graph object.
Character scalar, which implementation to use to carry out the
calculation. The default is "prpack"
, which uses the PRPACK library
(https://github.com/dgleich/prpack). This is a new implementation in igraph
version 0.7, and the suggested one, as it is the most stable and the fastest
for all but small graphs. "arpack"
uses the ARPACK library, the
default implementation from igraph version 0.5 until version 0.7.
power
uses a simple implementation of the power method, this was the
default in igraph before version 0.5 and is the same as calling
page_rank_old
.
The vertices of interest.
Logical, if true directed paths will be considered for directed graphs. It is ignored for undirected graphs.
The damping factor (‘d’ in the original paper).
Optional vector giving a probability distribution to calculate personalized PageRank. For personalized PageRank, the probability of jumping to a node when abandoning the random walk is not uniform, but it is given by this vector. The vector should contains an entry for each vertex and it will be rescaled to sum up to one.
A numerical vector or NULL
. This argument can be used
to give edge weights for calculating the weighted PageRank of vertices. If
this is NULL
and the graph has a weight
edge attribute then
that is used. If weights
is a numerical vector then it used, even if
the graph has a weights
edge attribute. If this is NA
, then no
edge weights are used (even if the graph has a weight
edge attribute.
This function interprets edge weights as connection strengths. In the
random surfer model, an edge with a larger weight is more likely to be
selected by the surfer.
Either a named list, to override some ARPACK options. See
arpack
for details; or a named list to override the default
options for the power method (if algo="power"
). The default options
for the power method are niter=1000
and eps=0.001
. This
argument is ignored if the PRPACK implementation is used.
The maximum number of iterations to perform.
The algorithm will consider the calculation as complete if the difference of PageRank values between iterations change less than this value for every node.
A logical scalar, whether the old style (pre igraph 0.5) normalization to use. See details below.
For page_rank
a named list with entries:
A numeric vector with the PageRank scores.
The eigenvalue corresponding to the eigenvector with the page rank scores. It should be always exactly one.
Some information about the underlying
ARPACK calculation. See arpack
for details. This entry is
NULL
if not the ARPACK implementation was used.
For page_rank_old a numeric vector of Page Rank scores.
For the explanation of the PageRank algorithm, see the following webpage: http://infolab.stanford.edu/~backrub/google.html, or the following reference:
Sergey Brin and Larry Page: The Anatomy of a Large-Scale Hypertextual Web Search Engine. Proceedings of the 7th World-Wide Web Conference, Brisbane, Australia, April 1998.
igraph 0.5 (and later) contains two PageRank calculation implementations.
The page_rank
function uses ARPACK to perform the calculation, see
also arpack
.
The page_rank_old
function performs a simple power method, this is
the implementation that was available under the name page_rank
in pre
0.5 igraph versions. Note that page_rank_old
has an argument called
old
. If this argument is FALSE
(the default), then the proper
PageRank algorithm is used, i.e. \((1-d)/n\) is added to the weighted
PageRank of vertices to calculate the next iteration. If this argument is
TRUE
then \((1-d)\) is added, just like in the PageRank paper;
\(d\) is the damping factor, and \(n\) is the total number of vertices.
A further difference is that the old implementation does not renormalize the
page rank vector after each iteration. Note that the old=FALSE
method is not stable, is does not necessarily converge to a fixed point. It
should be avoided for new code, it is only included for compatibility with
old igraph versions.
Please note that the PageRank of a given vertex depends on the PageRank of all other vertices, so even if you want to calculate the PageRank for only some of the vertices, all of them must be calculated. Requesting the PageRank for only some of the vertices does not result in any performance increase at all.
Since the calculation is an iterative process, the algorithm is stopped after a given count of iterations or if the PageRank value differences between iterations are less than a predefined value.
Sergey Brin and Larry Page: The Anatomy of a Large-Scale Hypertextual Web Search Engine. Proceedings of the 7th World-Wide Web Conference, Brisbane, Australia, April 1998.
Other centrality scores: closeness
,
betweenness
, degree
# NOT RUN {
g <- sample_gnp(20, 5/20, directed=TRUE)
page_rank(g)$vector
g2 <- make_star(10)
page_rank(g2)$vector
# Personalized PageRank
g3 <- make_ring(10)
page_rank(g3)$vector
reset <- seq(vcount(g3))
page_rank(g3, personalized=reset)$vector
# }
Run the code above in your browser using DataLab