Converts SCOPUS, Clarivate Analytics WoS, Dimensions, Lens.org, PubMed and COCHRANE Database export files, or pubmedR and dimensionsR JSON/XML objects, into a data frame, with cases corresponding to articles and variables to Field Tags as used in WoS.
convert2df(
file,
dbsource = "wos",
format = "plaintext",
remove.duplicates = TRUE
)
A data frame with cases corresponding to articles and variables to Field Tags in the original export file.
For example, if we have three files downloaded from Web of Science in plaintext format, file will be:
file <- c("filename1.txt", "filename2.txt", "filename3.txt")
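When the export files sit together in one folder, the same vector can be built programmatically rather than typed by hand. The following is a minimal sketch, assuming a hypothetical directory named "wos_exports" that holds only WoS plaintext exports; the call to convert2df is skipped when no files are found or when bibliometrix is not installed.

```r
# Hypothetical folder containing WoS plaintext exports (an assumption,
# not part of the original documentation).
export_dir <- "wos_exports"

# Collect every .txt file in the folder; returns character(0) if none exist.
file <- list.files(export_dir, pattern = "\\.txt$", full.names = TRUE)

# Convert only when there is something to read and the package is available.
if (length(file) > 0 && requireNamespace("bibliometrix", quietly = TRUE)) {
  M <- bibliometrix::convert2df(
    file = file,
    dbsource = "wos",
    format = "plaintext"
  )
}
```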
Data frame columns are named using the standard Clarivate Analytics WoS Field Tag coding. The main field tags are:
AU | Authors
TI | Document Title
SO | Publication Name (or Source)
JI | ISO Source Abbreviation
DT | Document Type
DE | Authors' Keywords
ID | Keywords associated by SCOPUS or WoS database
AB | Abstract
C1 | Author Address
RP | Reprint Address
CR | Cited References
TC | Times Cited
PY | Year
SC | Subject Category
UT | Unique Article Identifier
DB | Database
For a complete list of field tags see: Field Tags used in bibliometrix
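Because the columns are named by field tag, the converted data frame can be queried with ordinary data frame operations. The sketch below uses a small hand-made data frame with invented values standing in for the output of convert2df. It also assumes that multiple authors in AU are separated by ";", which is not stated on this page and should be checked against your own data.

```r
# Mock stand-in for the data frame returned by convert2df
# (values are invented for illustration only).
M <- data.frame(
  AU = c("AUTHOR A;AUTHOR B", "AUTHOR C"),
  TI = c("EXAMPLE TITLE ONE", "EXAMPLE TITLE TWO"),
  PY = c(2017, 2019),
  TC = c(10, 3),
  stringsAsFactors = FALSE
)

# Split the AU field into individual authors
# (assumes ";" is the separator).
authors <- strsplit(M$AU, ";", fixed = TRUE)
n_authors <- lengths(authors)

# Select title, year and citation count for documents from 2018 onwards.
recent <- M[M$PY >= 2018, c("TI", "PY", "TC")]
```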
file is a character array containing a sequence of filenames coming from WoS, Scopus, Dimensions, Lens.org, and PubMed. Alternatively, file can be an object resulting from an API query fetched from the Dimensions, PubMed or OpenAlex databases:
a) | 'wos' | Clarivate Analytics WoS (in plaintext '.txt', Endnote Desktop '.ciw', or bibtex formats '.bib'); |
b) | 'scopus' | SCOPUS (exclusively in bibtex format '.bib'); |
c) | 'dimensions' | Digital Science Dimensions (in csv '.csv' or excel '.xlsx' formats); |
d) | 'lens' | Lens.org (in csv '.csv'); |
e) | 'pubmed' | an object of the class pubmedR (package pubmedR) containing a collection obtained from a query performed with pubmedR package; |
f) | 'dimensions' | an object of the class dimensionsR (package dimensionsR) containing a collection obtained from a query performed with dimensionsR package; |
g) | 'openalex' | OpenAlex .csv file; |
h) | 'openalex_api' | a data frame object returned by openalexR package, containing a collection of works resulting from a query fetched from OpenAlex database. |
dbsource is a character indicating the bibliographic database. dbsource can be one of c('cochrane', 'dimensions', 'generic', 'isi', 'openalex', 'pubmed', 'scopus', 'wos', 'lens'). Default is dbsource = "wos".
format is a character indicating the export file format used by SCOPUS, Clarivate Analytics WoS, and the other databases. format can be one of c('api', 'bibtex', 'csv', 'endnote', 'excel', 'plaintext', 'pubmed'). Default is format = "plaintext".
remove.duplicates is logical. If TRUE, the function removes duplicated items, checking by DOI and database ID.
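The file extensions listed above suggest a simple way to pick the format value from a filename. The helper below, guess_format, is hypothetical and not part of bibliometrix; it is a minimal sketch that maps the extensions mentioned on this page to the corresponding format strings and stops on anything else.

```r
# Hypothetical helper (not a bibliometrix function): infer the `format`
# argument from a file extension, using the extensions listed above.
guess_format <- function(path) {
  ext <- tolower(tools::file_ext(path))
  lookup <- c(
    txt  = "plaintext",
    ciw  = "endnote",
    bib  = "bibtex",
    csv  = "csv",
    xlsx = "excel"
  )
  fmt <- unname(lookup[ext])
  if (anyNA(fmt)) {
    stop("Unrecognised file extension in: ", paste(path[is.na(fmt)], collapse = ", "))
  }
  fmt
}

guess_format(c("savedrecs.txt", "export.bib"))
```

A lookup like this only covers file-based imports; objects returned by an API query have no extension, so format must still be set by hand for them.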
# Example:
# Import and convert a Web of Science collection from an export file in plaintext format:
if (FALSE) {
files <- 'https://www.bibliometrix.org/datasets/wos_plaintext.txt'
M <- convert2df(file = files, dbsource = 'wos', format = "plaintext")
}