
CITAN (version 2015.12-2)

Scopus_ReadCSV: Import bibliography entries from a CSV file.

Description

Reads bibliography entries from a UTF-8 encoded CSV file.

Usage

Scopus_ReadCSV(filename, stopOnErrors = TRUE, dbIdentifier = "Scopus",
  alternativeIdPattern = "^.*\\id=|\\&.*$", ...)

Arguments

filename

the name of the file which the data are to be read from, see read.csv.

stopOnErrors

logical; if TRUE, stop on any potential parse error; otherwise only issue a warning.

dbIdentifier

character or NA; database identifier, helps detect parse errors, see Details.

alternativeIdPattern

character; regular expression used to extract AlternativeId, or NA to use the id as is.

...

further arguments to be passed to read.csv.

Value

A data.frame containing the following 11 columns:

Authors Author name(s), comma-separated, surnames first.
Title Document title.
Year Year of publication.
AlternativeId Unique document identifier.
SourceTitle Title of the source containing the document.
Volume Volume.
Issue Issue.
PageStart Start page; numeric.
PageEnd End page; numeric.
Citations Number of citations; numeric.
DocumentType Type of the document; see Details.

The object returned may be imported into a local bibliometric storage via lbsImportDocuments.

Details

The read.csv function is used to read the bibliography. You may therefore modify its behavior by passing further arguments (...); see the manual page of read.table for details.

The CSV file should contain at least the following columns:

  1. Authors: Author name(s) (surname first; multiple names are comma-separated, e.g. “Smith John, Nowak G. W.”),

  2. Title: Document title,

  3. Year: Year of publication,

  4. Source.title: Source title, e.g. journal name,

  5. Volume: Volume number,

  6. Issue: Issue number,

  7. Page.start: Start page number,

  8. Page.end: End page number,

  9. Cited.by: Number of citations received,

  10. Link: String containing unique document identifier, by default of the form ...id=UNIQUE_ID&... (see alternativeIdPattern parameter),

  11. Document.Type: Document type, one of: “Article”, “Article in Press”, “Book”, “Conference Paper”, “Editorial”, “Erratum”, “Letter”, “Note”, “Report”, “Review”, “Short Survey”, or NA (other categories are treated as NAs),

  12. Source: Data source identifier, must be the same as the dbIdentifier parameter value. It is used for parse error detection.
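The column requirements above can be checked with base R before calling Scopus_ReadCSV. The following is a minimal sketch, not the function's own implementation: the record, the link and the identifier ABC123 are made up for illustration, and the pattern passed to gsub is a simplified variant of the default alternativeIdPattern (written without the backslash escapes), assuming the pattern is applied by deleting everything it matches.

```r
# Names of the columns listed above, as read.csv reports them.
required <- c("Authors", "Title", "Year", "Source.title", "Volume", "Issue",
              "Page.start", "Page.end", "Cited.by", "Link",
              "Document.Type", "Source")

# Hypothetical one-record file; every value is invented for this sketch.
csv_text <- paste(
  paste(required, collapse = ","),
  paste0('"Smith John, Nowak G. W.","An example title",1990,',
         '"Example Journal",12,3,101,110,5,',
         '"http://example.org/record?origin=x&id=ABC123&foo=bar",',
         '"Article","Scopus"'),
  sep = "\n"
)

raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Required columns that the file lacks (none in this example).
missing_cols <- setdiff(required, names(raw))

# Assumption: the identifier is what remains after removing the text
# up to "id=" and the text from the next "&" onwards.
alt_id <- gsub("^.*id=|&.*$", "", raw$Link)

# Rows whose Source differs from the expected database identifier.
wrong_source <- which(raw$Source != "Scopus")
```

If missing_cols is non-empty or wrong_source lists any rows, the file probably needs to be fixed before import.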

The CSV file may, for example, be created by SciVerse Scopus (Export format=comma separated file, .csv (e.g. Excel), Output=Complete format or Citations only). Note that the exported CSV file sometimes needs to be corrected by hand (wrong page numbers, a single double-quote character inside a character string where a doubled one is required, etc.). We suggest making the corrections in a “Notepad”-like application (in plain text). The function tries to indicate the line numbers causing potential problems.
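To illustrate the quoting problem mentioned above, here is a rough base R sketch that flags lines containing an odd number of double-quote characters. It is only a heuristic and is not part of CITAN: a quoted field that legitimately spans several lines would also be flagged. The three lines are a made-up excerpt.

```r
# Hypothetical excerpt of an exported file; line 3 contains a stray quote.
lines <- c(
  'Title,Year',
  '"A ""quoted"" word",2001',
  '"A broken "title",2002'
)

# Count the double-quote characters on each line.
n_quotes <- lengths(regmatches(lines, gregexpr('"', lines, fixed = TRUE)))

# Lines with an odd count are candidates for manual correction.
suspect <- which(n_quotes %% 2 == 1)

# A correctly doubled quote is read back as a single quote character.
ok <- read.csv(text = paste(lines[1:2], collapse = "\n"),
               stringsAsFactors = FALSE)
```

For a real file, the lines could be obtained with readLines(filename, encoding = "UTF-8") before running the same check.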

See Also

Scopus_ASJC, Scopus_SourceList, lbsConnect, Scopus_ImportSources, read.table, lbsImportDocuments

Examples

# NOT RUN {
conn <- lbsConnect("Bibliometrics.db");
## ...
data <- Scopus_ReadCSV("db_Polish_MATH/Poland_MATH_1987-1993.csv");
lbsImportDocuments(conn, data, "Poland_MATH");
## ...
lbsDisconnect(conn);
# }
