Fit privacy-preserving distributed algorithms for linear, logistic, Poisson, and Cox PH regression, with possibly heterogeneous data across sites.
pda(ipdata = NULL, site_id, control = NULL, dir = NULL, uri = NULL, secret = NULL,
    upload_without_confirm = F, silent_message = F, digits = 4, hosdata = NULL)
control
ipdata: Local IPD as a data frame; should include at least one column for the outcome and one column for the covariates.
site_id: Character. Site name.
control: pda control data.
dir: Directory for the shared flat-file cloud.
uri: Uniform Resource Identifier for this run.
secret: Password used to authenticate as site_id on uri.
upload_without_confirm: Logical. If TRUE, upload silently without interactive confirmation.
silent_message: Logical. If TRUE, mute messages.
digits: Number of digits after the decimal point in the output JSON files.
hosdata: (for dGEM) Hospital-level data; should include the same name as defined in the control file.
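As a minimal sketch of how these arguments might fit together, the example below sets up a two-site run in a temporary directory. The argument names (`ipdata`, `site_id`, `control`, `dir`, `upload_without_confirm`, `silent_message`) come from the usage above. Everything else is an assumption made for illustration: the field names inside `control`, the simulated data and its column names, and the order of calls (lead site first, then a collaborating site). Check the package's own examples before relying on any of it.

```r
# Simulated data for one site; the column names are illustrative assumptions.
set.seed(1)
n <- 50
site_data <- data.frame(
  los = rnorm(n, mean = 5, sd = 2),    # outcome column
  age = rnorm(n, mean = 60, sd = 10),  # covariate columns
  sex = rbinom(n, 1, 0.5),
  lab = rnorm(n)
)

# Control settings. Every field name in this list is an assumption,
# not something documented on this page.
control <- list(
  project_name  = "Example study",
  step          = "initialize",
  sites         = c("site1", "site2"),
  heterogeneity = FALSE,
  model         = "DLM",
  family        = "gaussian",
  outcome       = "los",
  variables     = c("age", "sex", "lab"),
  optim_maxit   = 100,
  lead_site     = "site1",
  upload_date   = as.character(Sys.time())
)

# A temporary directory stands in for the shared flat-file directory.
shared_dir <- tempdir()

# Only call pda() when the package is installed.
if (requireNamespace("pda", quietly = TRUE)) {
  # Assumed workflow: the lead site supplies the control settings first.
  pda::pda(site_id = "site1", control = control, dir = shared_dir,
           upload_without_confirm = TRUE, silent_message = TRUE)
  # A collaborating site then supplies its local data.
  pda::pda(ipdata = site_data, site_id = "site2", dir = shared_dir,
           upload_without_confirm = TRUE, silent_message = TRUE)
}
```

Setting `upload_without_confirm = TRUE` skips the interactive confirmation, which is convenient in scripted runs. Leave it at the default when a person should review what is uploaded.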
Michael I. Jordan, Jason D. Lee & Yun Yang (2019) Communication-Efficient Distributed Statistical Inference,
Journal of the American Statistical Association, 114:526, 668-681
doi:10.1080/01621459.2018.1429274.
(DLM) Yixin Chen, et al. (2006) Regression cubes with lossless compression and aggregation.
IEEE Transactions on Knowledge and Data Engineering, 18(12), pp.1585-1599.
(DLMM) Chongliang Luo, et al. (2020) Lossless Distributed Linear Mixed Model with Application to Integration of Heterogeneous Healthcare Data.
medRxiv, doi:10.1101/2020.11.16.20230730.
(DPQL) Chongliang Luo, et al. (2021) dPQL: a lossless distributed algorithm for generalized linear mixed model with application to privacy-preserving hospital profiling.
medRxiv, doi:10.1101/2021.05.03.21256561.
(ODAL) Rui Duan, et al. (2020) Learning from electronic health records across multiple sites:
A communication-efficient and privacy-preserving distributed algorithm.
Journal of the American Medical Informatics Association, 27.3:376–385,
doi:10.1093/jamia/ocz199.
(ODAC) Rui Duan, et al. (2020) Learning from local to global: An efficient distributed algorithm for modeling time-to-event data.
Journal of the American Medical Informatics Association, 27.7:1028–1036,
doi:10.1093/jamia/ocaa044.
(ODACH) Chongliang Luo, et al. (2021) ODACH: A One-shot Distributed Algorithm for Cox model with Heterogeneous Multi-center Data.
medRxiv, doi:10.1101/2021.04.18.21255694.
(ODAH) Mackenzie J. Edmondson, et al. (2021) An Efficient and Accurate Distributed Learning Algorithm for Modeling Multi-Site Zero-Inflated Count Outcomes.
medRxiv, pp.2020-12.
doi:10.1101/2020.12.17.20248194.
(ADAP) Xiaokang Liu, et al. (2021) ADAP: multisite learning with high-dimensional heterogeneous data via A Distributed Algorithm for Penalized regression.
(dGEM) Jiayi Tong, et al. (2022) dGEM: Decentralized Generalized Linear Mixed Effects Model.
(COLA) Wu, Q., Reps, J.M., Li, L. et al. COLA-GLM: collaborative one-shot and lossless algorithms of generalized linear models for decentralized observational healthcare data. npj Digit. Med. 8, 442 (2025). https://doi.org/10.1038/s41746-025-01781-1.
(ODACT) Liang CJ, Luo C, Kranzler HR, Bian J, Chen Y. Communication-efficient federated learning of temporal effects on opioid use disorder with data from distributed research networks. J Am Med Inform Assoc. 2025 Apr 1;32(4):656-664. doi: 10.1093/jamia/ocae313. PMID: 39864407; PMCID: PMC12005629.
(DisC2o) Tong J, et al. DisC2o-HD: Distributed causal inference with covariates shift for analyzing real-world high-dimensional data. Journal of Machine Learning Research. 2025;26(3):1-50.
pdaPut, pdaList, pdaGet, getCloudConfig and pdaSync.