LaF (version 0.8.6)

laf_open_csv: Create a connection to a comma separated value (CSV) file.

Description

A connection to the file filename is created. Column types have to be specified; unlike read.csv, they are not determined automatically. This is done to increase speed.

Usage

laf_open_csv(
  filename,
  column_types,
  column_names = paste("V", seq_len(length(column_types)), sep = ""),
  sep = ",",
  dec = ".",
  trim = FALSE,
  skip = 0,
  ignore_failed_conversion = FALSE
)

Value

Object of type laf. Values can be extracted from this object using indexing and with methods such as read_lines and next_block.

Arguments

filename

character containing the filename of the CSV-file

column_types

character vector containing the types of data in each of the columns. Valid types are: double, integer, categorical and string.

column_names

optional character vector containing the names of the columns.

sep

optional character specifying the field separator used in the file.

dec

optional character specifying the decimal mark.

trim

optional logical specifying whether or not white space at the end of factor levels or character strings should be trimmed.

skip

optional numeric specifying the number of lines at the beginning of the file that should be skipped.

ignore_failed_conversion

optional logical specifying whether fields that could not be converted should be ignored (set to NA).
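
As an illustration of how sep and dec combine, the following minimal sketch opens a semicolon-separated file that uses a comma as the decimal mark. The file contents and the column names (id, value, region) are made up for this example.

```r
library(LaF)

# Hypothetical sample data: ';' as field separator, ',' as decimal mark
tmp <- tempfile(fileext = ".csv")
writeLines(c("1;3,14;north", "2;2,72;south", "3;1,41;north"), tmp)

# Column types must be given explicitly; names are optional
laf <- laf_open_csv(
  tmp,
  column_types = c("integer", "double", "string"),
  column_names = c("id", "value", "region"),
  sep = ";",
  dec = ","
)

# Read the three rows into a regular data.frame
dta <- laf[1:3, ]
str(dta)

# Cleanup
file.remove(tmp)
```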

Details

After the connection is created, data can be extracted using indexing (as with a normal data.frame), or read in blocks using methods such as read_lines and next_block. To process the file in blocks, the convenience function process_blocks can be used.
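
A minimal sketch of block-wise processing with process_blocks, assuming the callback receives the current block as its first argument and the previous result (NULL on the first call) as its second. The helper sum_first_column and the test data are hypothetical.

```r
library(LaF)

# Hypothetical test data without a header line
tmp <- tempfile(fileext = ".csv")
write.table(data.frame(a = 1:10, b = (1:10) / 2), file = tmp,
            row.names = FALSE, col.names = FALSE, sep = ",")

laf <- laf_open_csv(tmp, column_types = c("integer", "double"),
                    column_names = c("a", "b"))

# Accumulate the sum of the first column across blocks.
# 'result' is NULL on the first call; an empty block adds 0.
sum_first_column <- function(block, result) {
  if (is.null(result)) result <- 0
  result + sum(block[[1]])
}

# Process the file four rows at a time
total <- process_blocks(laf, sum_first_column, nrows = 4)
print(total)

# Cleanup
file.remove(tmp)
```

Keeping only a running result in memory is what makes this approach suitable for files that are too large to load at once.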

The CSV-file should not contain headers. Use the skip option to skip any headers.
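
For a file that does start with a header line, a sketch along these lines skips it with skip = 1 and passes the names separately through column_names. The file contents are invented for illustration, and reading the header with readLines is one possible approach, not something LaF does for you here.

```r
library(LaF)

# Hypothetical file with one header line
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,0.5", "2,1.5"), tmp)

# Read the header line with base R and split it into column names
header <- strsplit(readLines(tmp, n = 1), ",")[[1]]

# Skip the header line when opening the file with LaF
laf <- laf_open_csv(tmp, column_types = c("integer", "double"),
                    column_names = header, skip = 1)

dta <- laf[1:2, ]
print(dta)

# Cleanup
file.remove(tmp)
```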

In case of an incomplete line (a line with fewer columns than it should have): when the line is completely empty, the reader stops at that point and considers it the end of the file. In other cases a warning is issued and the remaining columns are considered empty. For character columns this results in an empty string; for numeric columns, an NA.

See Also

See read.csv for conventional access to CSV files, and detect_dm_csv to automatically determine the column types.
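
When the column types are not known in advance, a sketch like the following lets detect_dm_csv build a data model from the file, which is then opened with laf_open. The test data are hypothetical, and the detected types depend on the file contents.

```r
library(LaF)

# Hypothetical file with a header line
tmp <- tempfile(fileext = ".csv")
write.table(data.frame(id = 1:5, score = c(0.5, 1.5, 2.5, 3.5, 4.5)),
            file = tmp, row.names = FALSE, col.names = TRUE,
            sep = ",", quote = FALSE)

# Detect column names and types from the file
model <- detect_dm_csv(tmp, sep = ",", header = TRUE)

# Open the file using the detected data model
laf <- laf_open(model)
dta <- laf[1:5, ]
str(dta)

# Cleanup
file.remove(tmp)
```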

Examples

# Create temporary filename
tmpcsv  <- tempfile(fileext=".csv")

# Generate test data
ntest <- 10
column_types <- c("integer", "integer", "double", "string")
testdata <- data.frame(
    a = 1:ntest,
    b = sample(1:2, ntest, replace=TRUE),
    c = round(runif(ntest), 13),
    d = sample(c("jan", "pier", "tjores", "corneel"), ntest, replace=TRUE)
    )
# Write test data to csv file
write.table(testdata, file=tmpcsv, row.names=FALSE, col.names=FALSE, sep=',')

# Create LaF-object
laf <- laf_open_csv(tmpcsv, column_types=column_types)

# Read from file using indexing
first_column <- laf[ , 1]
first_row    <- laf[1, ]

# Read from file using blockwise operators
begin(laf)
first_block <- next_block(laf, nrows=2)
second_block <- next_block(laf, nrows=2)

# Cleanup
file.remove(tmpcsv)