read.xlsx: Read Excel File

Description

This function calls the read_xlsx() function in the readxl package by Hadley Wickham and Jennifer Bryan (2019) to read an Excel file (.xlsx).

Usage

read.xlsx(file, sheet = NULL, header = TRUE, range = NULL,
          coltypes = c("skip", "guess", "logical", "numeric", "date", "text", "list"),
          na = "", trim = TRUE, skip = 0, nmax = Inf, guessmax = min(1000, nmax),
          progress = readxl::readxl_progress(), name.repair = "unique",
          as.data.frame = TRUE, check = TRUE)

Value

Returns a data frame or tibble.

Arguments

file: a character string indicating the name of the Excel data file with or without file extension '.xlsx', e.g., "My_Excel_Data.xlsx" or "My_Excel_Data".
sheet: a character string indicating the name of a sheet or a numeric value indicating the position of the sheet to read. By default the first sheet will be read.
header: logical: if TRUE (default), the first row is used as column names; if FALSE, default names are used. A character vector giving a name for each column can also be used. If coltypes is provided as a vector, header can have one entry per column, i.e., have the same length as coltypes, or one entry per unskipped column.
range: a character string indicating the cell range to read from, e.g. typical Excel ranges like "B3:D87", possibly including the sheet name like "Data!B2:G14". Interpreted strictly, even if the range forces the inclusion of leading or trailing empty rows or columns. Takes precedence over skip, nmax and sheet.
coltypes: a character vector containing one entry per column from these options: "skip", "guess", "logical", "numeric", "date", "text", or "list". If exactly one type is specified, it will be recycled. By default (i.e., coltypes = NULL), column types will be guessed. The content of a cell in a skipped column is never read and that column will not appear in the data frame output. The "list" type loads a column as a list of length-1 vectors, which are typed using the type-guessing logic from coltypes = NULL, but on a cell-by-cell basis.
na: a character vector indicating strings to interpret as missing values. By default, blank cells will be treated as missing data.
trim: logical: if TRUE (default), leading and trailing whitespace will be trimmed.
skip: a numeric value indicating the minimum number of rows to skip before reading anything, be it column names or data. Leading empty rows are automatically skipped, so this is a lower bound. Ignored if the argument range is specified.
nmax: a numeric value indicating the maximum number of data rows to read. Trailing empty rows are automatically skipped, so this is an upper bound on the number of rows in the returned data frame. Ignored if the argument range is specified.
guessmax: a numeric value indicating the maximum number of data rows to use for guessing column types.
progress: logical indicating whether to display a progress spinner. By default, the spinner appears only in an interactive session, outside the context of knitting a document, and when the call is likely to run for several seconds or more.
name.repair: a character string indicating the handling of column names. By default, the function ensures column names are not empty and are unique.
as.data.frame: logical: if TRUE (default), function returns a regular data frame; if FALSE function returns a tibble.
check: logical: if TRUE (default), argument specification is checked.
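
The interaction between range, skip, nmax and a recycled coltypes entry can be explored on the underlying readxl function. The following is a minimal sketch, not the wrapper's own code: it calls readxl::read_xlsx() directly and uses the example workbook shipped with readxl. It assumes that read.xlsx() passes coltypes and nmax on as col_types and n_max; that mapping is inferred from the description above and is not confirmed by this page.

```r
library(readxl)

# Example workbook shipped with readxl; its first sheet is "iris"
path <- readxl_example("datasets.xlsx")

# range is interpreted strictly and takes precedence over skip and n_max,
# so skip = 10 and n_max = 1 have no effect here
dat.range <- read_xlsx(path, range = "A1:B5", skip = 10, n_max = 1)

# The range may carry the sheet name; a single col_types entry is recycled
# across all columns in the range
dat.text <- read_xlsx(path, range = "mtcars!A1:C4", col_types = "text")

# n_max is an upper bound on the number of data rows (header row not counted)
dat.nmax <- read_xlsx(path, sheet = "chickwts", n_max = 3)

dim(dat.range)
sapply(dat.text, class)
nrow(dat.nmax)
```

read_xlsx() returns a tibble; with as.data.frame = TRUE the wrapper documented here returns a regular data frame instead.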

Author

Hadley Wickham and Jennifer Bryan

References

Wickham H, Bryan J (2023). readxl: Read Excel Files. R package version 1.4.3. https://CRAN.R-project.org/package=readxl

Examples

if (FALSE) {
# Example 1: Read Excel file (.xlsx)
read.xlsx("data.xlsx")

# Example 2: Read Excel file (.xlsx), use default names as column names
read.xlsx("data.xlsx", header = FALSE)

# Example 3: Read Excel file (.xlsx), interpret -99 as missing values
read.xlsx("data.xlsx", na = "-99")

# Example 4: Read Excel file (.xlsx), use x1, x2, and x3 as column names
read.xlsx("data.xlsx", header = c("x1", "x2", "x3"))

# Example 5: Read Excel file (.xlsx), read cells A1:B5
read.xlsx("data.xlsx", range = "A1:B5")

# Example 6: Read Excel file (.xlsx), skip at least 2 rows before reading anything
read.xlsx("data.xlsx", skip = 2)

# Example 7: Read Excel file (.xlsx), return a tibble
read.xlsx("data.xlsx", as.data.frame = FALSE)
}