convert2df: Import and Convert bibliographic export files and API objects.

Description

Converts export files from SCOPUS, Clarivate Analytics WoS, Dimensions, Lens.org, PubMed and the COCHRANE Database, or pubmedR and dimensionsR JSON/XML objects, into a data frame, with cases corresponding to articles and variables to Field Tags as used in WoS.

Usage

convert2df(
  file,
  dbsource = "wos",
  format = "plaintext",
  remove.duplicates = TRUE
)

Value

A data frame with cases corresponding to articles and variables to Field Tags in the original export file.

For example, if three files have been downloaded from Web of Science in plaintext format, file will be:

file <- c("filename1.txt", "filename2.txt", "filename3.txt")
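A minimal sketch of passing that vector to convert2df, assuming the bibliometrix package is installed and the three placeholder files exist in the working directory. The guard is an addition for illustration and simply skips the import otherwise.

```r
# Placeholder file names from the text above; replace with real export files.
file <- c("filename1.txt", "filename2.txt", "filename3.txt")

# Only attempt the import when the package and all files are available.
if (requireNamespace("bibliometrix", quietly = TRUE) && all(file.exists(file))) {
  M <- bibliometrix::convert2df(file = file, dbsource = "wos", format = "plaintext")
  dim(M)  # rows = articles, columns = field tags
}
```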

Data frame columns are named using the standard Clarivate Analytics WoS Field Tag codes. The main field tags are:

`AU`		Authors
`TI`		Document Title
`SO`		Publication Name (or Source)
`JI`		ISO Source Abbreviation
`DT`		Document Type
`DE`		Authors' Keywords
`ID`		Keywords associated by SCOPUS or WoS database
`AB`		Abstract
`C1`		Author Address
`RP`		Reprint Address
`CR`		Cited References
`TC`		Times Cited
`PY`		Year
`SC`		Subject Category
`UT`		Unique Article Identifier
`DB`		Database

For a complete list of field tags, see: Field Tags used in bibliometrix
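The sketch below shows how columns named by field tag might be used once a collection has been converted. It builds a small toy data frame rather than calling convert2df, and it assumes that multi-valued fields such as AU are stored as a single string with entries separated by ";". Both the toy values and that separator are assumptions made for illustration.

```r
# Toy stand-in for a converted collection; real output has many more columns.
M <- data.frame(
  AU = c("SMITH J;DOE A", "ROSSI M"),
  TI = c("FIRST TITLE", "SECOND TITLE"),
  PY = c(2019, 2021),
  TC = c(12, 3),
  stringsAsFactors = FALSE
)

# Split the multi-valued AU field into one character vector per article
# (assumes ";" as the separator).
authors <- strsplit(M$AU, ";", fixed = TRUE)
n_authors <- lengths(authors)

# Select a few field-tag columns, most cited articles first.
M[order(-M$TC), c("TI", "PY", "TC")]
```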

Arguments

file

a character vector containing a sequence of filenames exported from WoS, Scopus, Dimensions, Lens.org, or PubMed. Alternatively, file can be an object resulting from an API query fetched from the Dimensions, PubMed or OpenAlex databases. The supported sources are:

a)	'wos'	Clarivate Analytics WoS (in plaintext '.txt', Endnote Desktop '.ciw', or bibtex formats '.bib');
b)	'scopus'	SCOPUS (exclusively in bibtex format '.bib');
c)	'dimensions'	Digital Science Dimensions (in csv '.csv' or excel '.xlsx' formats);
d)	'lens'	Lens.org (in csv '.csv');
e)	'pubmed'	an object of the class pubmedR (package pubmedR) containing a collection obtained from a query performed with the pubmedR package;
f)	'dimensions'	an object of the class dimensionsR (package dimensionsR) containing a collection obtained from a query performed with the dimensionsR package;
g)	'openalex'	OpenAlex .csv file;
h)	'openalex_api'	a data frame object returned by openalexR package, containing a collection of works resulting from a query fetched from OpenAlex database.
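The file-based entries in the list above pair each extension with a format value. As a rough sketch, the pairing could be captured in a small lookup. guess_format is a hypothetical helper written for this illustration, not a bibliometrix function, and the caller still has to choose dbsource.

```r
# Hypothetical helper (not part of bibliometrix): map a file extension to the
# `format` value suggested by the list above.
guess_format <- function(path) {
  ext <- tolower(tools::file_ext(path))
  lookup <- c(txt = "plaintext", ciw = "endnote", bib = "bibtex",
              csv = "csv", xlsx = "excel")
  fmt <- unname(lookup[ext])
  if (anyNA(fmt)) {
    stop("unrecognised extension: ", paste(ext[is.na(fmt)], collapse = ", "))
  }
  fmt
}

guess_format(c("savedrecs.txt", "scopus.bib", "dimensions.xlsx"))
```

The returned value could then be passed as the format argument, for instance convert2df(file, dbsource = "wos", format = guess_format(file)) for a single WoS export.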

dbsource

a character string indicating the bibliographic database. dbsource can be one of c('cochrane', 'dimensions', 'generic', 'isi', 'openalex', 'pubmed', 'scopus', 'wos', 'lens'). Default is dbsource = "wos".

format

a character string indicating the export file format used by SCOPUS, Clarivate Analytics WoS, and the other databases. format can be one of c('api', 'bibtex', 'csv', 'endnote', 'excel', 'plaintext', 'pubmed'). Default is format = "plaintext".

remove.duplicates

a logical value. If TRUE, the function removes duplicated items, identified by DOI and database ID.
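To illustrate the idea of checking by DOI and database ID, here is a base R sketch on a toy data frame. It is not the package's implementation. The use of the DI column for the DOI and the UT column for the database ID is an assumption, as is the rule that missing values never count as duplicates.

```r
# Toy records: rows 1 and 2 share the same DOI and database ID.
M <- data.frame(
  DI = c("10.1000/a", "10.1000/a", NA, NA),
  UT = c("WOS:001", "WOS:001", "WOS:002", "WOS:003"),
  TI = c("PAPER A", "PAPER A", "PAPER B", "PAPER C"),
  stringsAsFactors = FALSE
)

# Flag a row as duplicated when its non-missing DOI or its non-missing
# database ID has already appeared in an earlier row.
dup_doi <- !is.na(M$DI) & duplicated(M$DI)
dup_id  <- !is.na(M$UT) & duplicated(M$UT)

# Keep only the first occurrence of each record.
M_unique <- M[!(dup_doi | dup_id), ]
nrow(M_unique)
```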

Examples



# Example:
# Import and convert a Web of Science collection from an export file in plaintext format:

if (FALSE) {
files <- 'https://www.bibliometrix.org/datasets/wos_plaintext.txt'

M <- convert2df(file = files, dbsource = 'wos', format = "plaintext")
}
