
PAFit (version 1.2.10)

PAFit-package: Generative Mechanism Estimation in Temporal Complex Networks

Description

A package for estimating preferential attachment and node fitness generative mechanisms in temporal complex networks. References: Thong Pham et al. (2015) <doi:10.1371/journal.pone.0137796>, Thong Pham et al. (2016) <doi:10.1038/srep32558>, Thong Pham et al. (2020) <doi:10.18637/jss.v092.i03>, Thong Pham et al. (2021) <doi:10.1093/comnet/cnab024>.

Author

Thong Pham <thongphamthe@gmail.com>, Paul Sheridan, and Hidetoshi Shimodaira.

Details

Package:     PAFit
Type:        Package
Version:     1.2.10
Authors:     Thong Pham, Paul Sheridan, Hidetoshi Shimodaira
Maintainer:  Thong Pham <thongphamthe@gmail.com>
Date:        2024-03-28
License:     GPL-3

The PAFit package provides a comprehensive framework for analyzing the growth mechanisms of temporal complex networks. In particular, it implements functions to simulate various temporal network models, gather essential network statistics from raw input data, and use these summarized statistics in the estimation of the attachment function \(A_k\) and node fitnesses \(\eta_i\). The heavy computational parts of the package are implemented in C++ through the use of the Rcpp package. Furthermore, users with a multi-core machine can benefit from a hassle-free speedup through the OpenMP parallelization implemented in the code. Apart from the main functions, the package also includes a real-world dataset of collaborations between scientists in the field of complex networks (coauthor.net). The main package functionalities are as follows.

Firstly, most well-known temporal network models based on the preferential attachment (PA) and node fitness mechanisms can be easily simulated using the package. PAFit implements generate_BA for the Barabási-Albert (BA) model, generate_ER for the growing Erdős–Rényi (ER) model, generate_BB for the Bianconi-Barabási (BB) model and generate_fit_only for the Caldarelli model. These functions have many customizable options; for example, the number of new edges at each time-step is a tunable stochastic variable. They are wrappers of the more powerful generate_net function, which simulates networks with more flexible attachment function and node fitness settings.
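To make the mechanisms behind these generators concrete, the following is a minimal base-R sketch of a growing network in which each new node attaches to an existing node \(i\) with probability proportional to \(A_k \times \eta_i\). It is not the package's implementation: simulate_pa and its arguments are hypothetical names used only for illustration, \(k\) is taken to be the in-degree, and the attachment function is assumed to be \(A_k = \max(k, 1)^{\alpha}\).

```r
# Illustrative sketch only; none of these names belong to the PAFit API.
simulate_pa <- function(n_nodes = 200, n_seed = 5, alpha = 1,
                        fitness = rep(1, n_nodes)) {
  stopifnot(n_seed >= 2, n_nodes > n_seed, length(fitness) == n_nodes)
  from   <- integer(n_nodes - 1)
  to     <- integer(n_nodes - 1)
  time   <- integer(n_nodes - 1)
  degree <- integer(n_nodes)  # in-degree of every node

  # Seed network at time 0: node i points to node i - 1.
  for (i in 2:n_seed) {
    from[i - 1] <- i
    to[i - 1]   <- i - 1L
    time[i - 1] <- 0L
    degree[i - 1] <- degree[i - 1] + 1L
  }

  # Growth: one new node bringing one new edge per time-step.
  for (i in (n_seed + 1):n_nodes) {
    existing <- seq_len(i - 1)
    # Attachment weight = A_k * eta_i, with A_k = max(k, 1)^alpha (assumed form).
    weight <- pmax(degree[existing], 1)^alpha * fitness[existing]
    target <- sample(existing, size = 1, prob = weight)
    from[i - 1] <- i
    to[i - 1]   <- target
    time[i - 1] <- i - n_seed
    degree[target] <- degree[target] + 1L
  }
  data.frame(from = from, to = to, time = time)
}

set.seed(1)
net_edges <- simulate_pa(n_nodes = 200, alpha = 1)
head(net_edges)
```

Under these assumptions, alpha = 1 with constant fitness resembles the BA model, alpha = 0 with constant fitness resembles the growing ER model, alpha = 1 with random fitness resembles the BB model, and alpha = 0 with random fitness resembles the Caldarelli model.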

Secondly, the function get_statistics efficiently collects all temporal network summary statistics. We note that get_statistics automatically handles both directed and undirected networks. It returns a list containing many statistics that can be used to characterize the network growth process. Notable fields are m_tk containing the number of new edges that connect to a degree-\(k\) node at time-step \(t\), and node_degree containing the degree sequence, i.e., the degree of each node at each time-step.
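As an illustration of what a statistic such as m_tk records, the sketch below counts, for a toy timestamped edge list, how many new edges at each time-step land on nodes of in-degree \(k\). It is written in base R and is only a conceptual sketch: count_m_tk and the (from, to, time) layout are hypothetical, degrees are assumed to be read at the onset of each time-step, and the actual layout of the m_tk field returned by get_statistics may differ.

```r
# Toy timestamped edge list (from, to, time); time 0 is the seed network.
edges <- data.frame(
  from = c(2, 3, 4, 5, 6, 7),
  to   = c(1, 1, 2, 1, 2, 1),
  time = c(0, 0, 1, 1, 2, 2)
)

# Hypothetical helper, not part of PAFit.
count_m_tk <- function(edges) {
  n_nodes <- max(edges$from, edges$to)
  steps   <- sort(unique(edges$time[edges$time > 0]))
  # In-degrees in the seed network.
  degree  <- tabulate(edges$to[edges$time == 0], nbins = n_nodes)
  max_k   <- nrow(edges)  # an in-degree can never exceed the number of edges
  m_tk    <- matrix(0L, nrow = length(steps), ncol = max_k + 1,
                    dimnames = list(t = steps, k = 0:max_k))
  for (s in seq_along(steps)) {
    targets <- edges$to[edges$time == steps[s]]
    # Degrees are read before the edges of this time-step are added.
    k_hit <- degree[targets]
    m_tk[s, ] <- tabulate(k_hit + 1, nbins = max_k + 1)
    degree <- degree + tabulate(targets, nbins = n_nodes)
  }
  m_tk
}

m_tk <- count_m_tk(edges)
print(m_tk)
```

Rows are indexed by time-step and columns by degree, so each row sums to the number of new edges added at that time-step.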

The most important functionality of the package is estimating the attachment function and node fitnesses of a temporal network. This is implemented through various methods. For temporal data there are three usages: estimation of the attachment function in isolation, estimation of the node fitnesses in isolation, and the joint estimation of the attachment function and node fitnesses. In addition, the attachment function can be estimated from a single network snapshot.

  • The functions for estimating the attachment function in isolation are: Jeong for Jeong's method (Ref. 1), Newman for Newman's method (Ref. 2), and only_A_estimate for the PAFit method (Ref. 3).

  • For estimation of node fitnesses in isolation, only_F_estimate implements a variant of the PAFit method (Ref. 4).

  • For the joint estimation of the attachment function and node fitnesses, we implement the full version of the PAFit method in joint_estimate (Ref. 4).

  • For estimating the nonparametric attachment function from a single snapshot, use PAFit_oneshot (Ref. 6).

Excluding PAFit_oneshot, these functions take the network object together with the output object of the function get_statistics, as in the joint_estimate call in the Examples section. The output object of these functions contains the estimation results as well as some additional information pertaining to the estimation process. The estimated attachment function and/or node fitnesses can be plotted by calling plot on this output object together with the get_statistics output. This will visualize not only the estimated results but also the remaining uncertainties when possible.
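To give some intuition for what estimating the attachment function in isolation involves, the base-R sketch below simulates a small network with true \(A_k = \max(k, 1)\) and recovers \(A_k\) with a naive ratio estimator: the number of new edges received by degree-\(k\) nodes divided by the accumulated share of degree-\(k\) nodes. This is only in the spirit of histogram-type approaches; it is not the PAFit algorithm, nor an exact reimplementation of Jeong's or Newman's method, and every name in it is hypothetical.

```r
set.seed(42)
n_nodes <- 2000
n_seed  <- 10
degree  <- integer(n_nodes)
degree[1:(n_seed - 1)] <- 1L  # seed chain: node i points to node i - 1

max_k    <- n_nodes
hits     <- numeric(max_k + 1)  # hits[k + 1]: new edges received by degree-k nodes
exposure <- numeric(max_k + 1)  # exposure[k + 1]: share of degree-k nodes, summed over time

for (i in (n_seed + 1):n_nodes) {
  existing <- seq_len(i - 1)
  k_now    <- degree[existing]
  # One new edge per time-step, drawn with true A_k = max(k, 1).
  target   <- sample(existing, size = 1, prob = pmax(k_now, 1))
  exposure <- exposure + tabulate(k_now + 1, nbins = max_k + 1) / length(existing)
  hits[k_now[target] + 1] <- hits[k_now[target] + 1] + 1
  degree[target] <- degree[target] + 1L
}

keep  <- which(hits >= 20)         # keep only degrees with enough observations
k     <- keep - 1
A_hat <- hits[keep] / exposure[keep]
A_hat <- A_hat / A_hat[k == 1]     # fix the scale so that the estimate at k = 1 is 1

# Rough attachment exponent from a log-log fit over k >= 1.
pos   <- k >= 1
slope <- unname(coef(lm(log(A_hat[pos]) ~ log(k[pos])))[2])
print(data.frame(k = k, A_hat = round(A_hat, 2)))
print(slope)
```

The threshold of 20 observations is an arbitrary choice for this sketch. Unlike the methods implemented in the package, this estimator provides no uncertainty quantification and does not account for node fitness.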

References

1. Jeong, H., Néda, Z. & Barabási, A. (2003). Measuring Preferential Attachment in Evolving Networks. Europhysics Letters 61(4):567-572. doi:10.1209/epl/i2003-00166-9.

2. Newman, M. (2001). Clustering and Preferential Attachment in Growing Networks. Physical Review E 64(2):025102. doi:10.1103/PhysRevE.64.025102.

3. Pham, T., Sheridan, P. & Shimodaira, H. (2015). PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks. PLOS ONE 10(9):e0137796. doi:10.1371/journal.pone.0137796.

4. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558. doi:10.1038/srep32558.

5. Pham, T., Sheridan, P. & Shimodaira, H. (2020). PAFit: An R Package for the Non-Parametric Estimation of Preferential Attachment and Node Fitness in Temporal Complex Networks. Journal of Statistical Software 92(3). doi:10.18637/jss.v092.i03.

6. Pham, T., Sheridan, P. & Shimodaira, H. (2021). Non-parametric estimation of the preferential attachment function from one network snapshot. Journal of Complex Networks 9(5):cnab024. doi:10.1093/comnet/cnab024.

See Also

See the accompanying vignette for a tutorial.

See also the GitHub page.

Examples

if (FALSE) {
  ### Jointly estimate the attachment function and node fitnesses
  library("PAFit")
  set.seed(1)

  # Generate a Bianconi-Barabasi network:
  #   size of initial network = 100
  #   number of new nodes at each time-step = 100
  #   Ak = k; inverse variance of the fitness distribution: s = 10
  net       <- generate_BB(N = 1000, m = 10,
                           num_seed = 100, multiple_node = 100,
                           s = 10)
  net_stats <- get_statistics(net)

  # Joint estimation of the attachment function Ak and node fitnesses
  result <- joint_estimate(net, net_stats)

  summary(result)

  # Plot the estimated attachment function
  plot(result, net_stats)

  # Overlay the true attachment function
  true_A <- pmax(result$estimate_result$center_k, 1)
  lines(result$estimate_result$center_k, true_A, col = "red")
  legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")

  # Plot the distribution of estimated node fitnesses
  plot(result, net_stats, plot = "f")

  # Plot the estimated node fitnesses against the true node fitnesses
  plot(result, net_stats, true = net$fitness, plot = "true_f")
}
