rad, rad.brief, rad.labels, rad.both, rad2
Reads the contents of the specified data file and/or variable labels into an R data frame. The file can be a standard csv data file, a fixed width formatted data file, or a native SPSS or R data file. The data are read into a data frame called mydata. Any optional variable labels are read into a data frame called mylabels. Identify the file either by browsing for it on the local computer system with rad(), or with an argument given as a character string in the form of a path name or a web URL. The function also lists the first and last three rows of data, the variable names, the dimensions of the resulting data frame, and the data type of each variable. In addition, it performs an analysis of missing data, listing the number of missing values for each variable and for each observation.
Also see the lessR function corRead and its alternate form rad.cor to read a correlation matrix.
Read(ref=NULL, brief=FALSE, show.R=FALSE, attach=FALSE,
     n.mcut=1, miss.show=30, miss.zero=FALSE, miss.matrix=FALSE,
     format=c("csv", "SPSS", "R"), data=TRUE, labels=FALSE,
     missing="", max.lines=30, widths=NULL, ...)

rad(...)
rad.brief(ref=NULL, brief=TRUE, ...)
rad.labels(ref=NULL, data=FALSE, labels=TRUE, ...)
rad.both(ref=NULL, data=TRUE, labels=TRUE, ...)
rad2(ref=NULL, sep=";", dec=",", ...)
[...] http://.
[...] lessR output, albeit without the lessR formatting.
[...] mydata by default.
[...] n.mcut.
format: [...] csv file, and as an option can be an SPSS sav file, which also reads the variable labels if present.
[...] Also set to TRUE if the file to be read has a f[...]
data: If TRUE, then read data, otherwise only variable labels are read.
labels: If TRUE, then the second row of information of a csv data file, after the variable names in the first row, consists of the variable labels. Data begins on the third row.
...: Options for the read.table function, such as sep, row.names and header.

Read reads csv
data files. One way to create a csv data file is to enter the data into a text editor. A more structured method is to use a worksheet application such as MS Excel or LibreOffice Calc. Place the variable names in the first row of the worksheet. Each column of the worksheet contains the data for the corresponding variable. Each subsequent row contains the data for a specific observation, such as a person or a company. Display all numeric data in the worksheet in the General format, so that the only non-digit character in a numeric data value is a decimal point. The General format removes dollar signs and commas, for example, leaving only the pure number, stripped of the extra characters that R will not properly read as part of a numeric data value.
To create the csv file from a standard worksheet application such as Microsoft Excel or LibreOffice Calc, first convert any numeric data to the General format to remove characters such as dollar signs and commas, and then under the File option, do a Save As and choose the csv format.
Invoke the sep="" option to read tab-delimited data, or data delimited by any other white space. Do help(read.table) to view the other options that can also be passed through Read.
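The csv conventions above can be tried directly in base R. The following is a minimal sketch that uses read.csv and read.table rather than Read itself, with small made-up files written to temporary locations; the file contents and variable names are illustrative assumptions only.

```r
# Write a small csv file: variable names in the first row,
# one observation per subsequent row (illustrative data)
csv_file <- tempfile(fileext = ".csv")
writeLines(c("Name,Salary,Years",
             "Ann,52000.50,7",
             "Bob,48000,3"), csv_file)

d <- read.csv(csv_file)
str(d)                      # Salary and Years are read as numeric

# The same kind of values saved with currency formatting
# are not read as numeric data
bad_file <- tempfile(fileext = ".csv")
writeLines(c("Name,Salary",
             'Ann,"$52,000.50"',
             'Bob,"$48,000"'), bad_file)
bad <- read.csv(bad_file)
is.numeric(bad$Salary)      # FALSE

# Tab-delimited data: sep="" treats any white space as the delimiter
tab_file <- tempfile(fileext = ".txt")
writeLines(c("X\tY", "1\t2.5", "3\t4.5"), tab_file)
tabbed <- read.table(tab_file, header = TRUE, sep = "")
```

The second file shows why the General format matters: a dollar sign or a thousands comma turns the whole column into non-numeric data.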
MECHANICS
Specify the file as with the Read function for reading the data into a data frame. If no arguments are passed to the function, then interactively browse for the file. Or, enclose within quotes a full path name or a URL for reading the labels on the web.
Given a csv data file, read the data into an R data frame called mydata with Read. Because Read calls the standard R function read.csv, which just provides a wrapper for read.table, the usual options that work with read.table, such as row.names, also can be passed through Read.
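The pass-through of read.table options can be pictured with a small wrapper. This is only a sketch of the idea, not the lessR source: read_sketch is a hypothetical name, and the assumption is that extra arguments are simply forwarded to read.csv through `...`.

```r
# Hypothetical stand-in for the pass-through; not lessR's actual code
read_sketch <- function(ref = NULL, ...) {
  if (is.null(ref)) ref <- file.choose()  # browse when no file is given
  read.csv(ref, ...)                      # extra options go on to read.csv
}

# Illustrative csv file with an ID field in the first column
csv_file <- tempfile(fileext = ".csv")
writeLines(c("ID,X,Y", "a1,1,2", "a2,3,4"), csv_file)

# row.names is a read.table option, forwarded unchanged
d <- read_sketch(csv_file, row.names = 1)
rownames(d)
```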
SPSS DATA
To read data in the SPSS .sav format, Read calls the read.spss function from the foreign package. If the file has a file type of .sav, that is, the file specification ends in .sav, then the format is automatically set to "SPSS". To invoke this option for a relevant data file of any file type, explicitly specify format="SPSS".
R DATA
By convention only, data files in native R format have a file type of .rda. To read a native R data file, if the file type is .rda, the format is automatically set to "R". To invoke this option for a relevant data file of any file type, explicitly specify format="R". Create a native R data file by saving the current data frame, usually mydata, with the lessR function Write. When read back into a working R session, the data are restored as the complete data frame of the same name from which they were saved.
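The file-type rule for the SPSS and R formats, and the way a native R file restores a data frame under its original name, can be sketched in base R. The helper guess_format is hypothetical and only mirrors the rule stated above; base save() and load() stand in for the lessR Write and Read functions, whose internals are not shown on this page.

```r
# Hypothetical helper: infer the format from the file type
guess_format <- function(ref) {
  ext <- tolower(tools::file_ext(ref))
  if (ext == "sav") "SPSS" else if (ext == "rda") "R" else "csv"
}
guess_format("survey.sav")
guess_format("mydata.rda")
guess_format("scores.csv")

# Round trip through a native R data file with base save() and load()
mydata <- data.frame(X = 1:3, Y = c("a", "b", "c"))
rda_file <- tempfile(fileext = ".rda")
save(mydata, file = rda_file)
rm(mydata)
load(rda_file)   # restores the object under its saved name, mydata
```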
FIXED WIDTH FORMATTED DATA
Sometimes the width of the columns is the same for all the data values of a variable, such as a data file of Likert scale responses from 1 to 5 on 50 survey items, so that the data consist of 50 columns with no spaces or other delimiter to separate adjacent data values. To read these data, based upon the R function read.fwf, invoke the widths option of that function.
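As a small illustration of the widths option, the sketch below calls the base R function read.fwf directly on a made-up file with a 5 column ID field, a 2 column Age field, and four single-column items. The data and the variable names are assumptions for the example; paste0 is used here to build the item names.

```r
# Two observations, no delimiters between adjacent values
fwf_file <- tempfile(fileext = ".txt")
writeLines(c("00001253415",
             "00002411524"), fwf_file)

# widths gives the number of columns occupied by each variable
d <- read.fwf(fwf_file,
              widths = c(5, 2, rep(1, 4)),
              col.names = c("ID", "Age", paste0("Q", 1:4)))
d
```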
MISSING DATA
By default, Read lists each variable and each row together with its number of missing values, indicated by the standard R missing value code NA. When reading the data, Read automatically sets any empty values as missing. Note that this differs from the R default in read.table, in which an empty value for a character string variable is treated as a regular data value. Any other valid value for any data type can be set to missing as well with the missing option. To mimic the standard R default for missing character values, set missing=NA.
To not list the variable name or row name of variables or rows without missing data, keep the default miss.zero=FALSE option, which can appreciably reduce the amount of output for large data sets. To view the entire data table in terms of 0's and 1's for non-missing and missing data, respectively, invoke the miss.matrix=TRUE option.
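The kind of missing data summary described above can be approximated in base R. This is a sketch under assumptions, not the code that Read runs: the na.strings argument of read.csv plays the role of the missing option, and the small data file is made up for the example.

```r
# Illustrative file with empty fields and a -99 missing code
csv_file <- tempfile(fileext = ".csv")
writeLines(c("Name,Score,Group",
             "Ann,10,A",
             ",-99,B",
             "Cy,,"), csv_file)

# Treat empty fields and the code -99 as missing
d <- read.csv(csv_file, na.strings = c("", "-99"))

colSums(is.na(d))       # number of missing values per variable
rowSums(is.na(d))       # number of missing values per observation
miss <- 1L * is.na(d)   # 0 = not missing, 1 = missing
miss
```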
VARIABLE LABELS
Standard R does not provide for variable labels, but lessR provides for a data frame called mylabels which stores variable labels. A labels data frame can list the label for some or all of the variables in the data frame that contains the data for the analysis. One way to enter the variable labels is to read them from their own file with Read with labels=TRUE and data=FALSE, or with the short form rad.labels. Another way is to include the labels directly in the data file, as the second row of information, after the variable names in the first row and before the first row of data, in the third row of the file. To do this, set labels=TRUE, or, equivalently, invoke the short form rad.both, which reads both the variable labels and the data from the same file. The web survey application Qualtrics downloads csv files in this format.
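The layout with labels in the second row can be unpacked with base R alone. The sketch below is one possible approach under assumptions, not the lessR implementation: it reads the first two rows separately, then skips them when reading the data. The object names labels_df and data_df and the data values are illustrative.

```r
# Names in row 1, labels in row 2, data from row 3 on
both_file <- tempfile(fileext = ".csv")
writeLines(c("I1,I3",
             paste0('"This instructor has command of the subject.",',
                    '"This instructor relates course materials to real world situations."'),
             "5,4",
             "3,2"), both_file)

# First two rows: variable names and variable labels
hdr <- read.csv(both_file, header = FALSE, nrows = 2,
                stringsAsFactors = FALSE)
var_names <- unlist(hdr[1, ], use.names = FALSE)
labels_df <- data.frame(label = unlist(hdr[2, ], use.names = FALSE),
                        row.names = var_names,
                        stringsAsFactors = FALSE)

# Remaining rows: the data, named from the first row
data_df <- read.csv(both_file, header = FALSE, skip = 2,
                    col.names = var_names)
```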
The lessR functions that provide analysis, such as hst for a histogram, automatically include the variable labels in their output, such as the title of a graph. Standard R functions can also use these variable labels by invoking the label function, such as setting main=label(I4) to put the variable label for a variable named I4 in the title of a graph.
For a file that contains only labels, each row of the file, including the first row, consists of the variable name, a comma, and then the label, that is, standard csv format such as obtained with the csv option from a standard worksheet application such as Microsoft Excel or LibreOffice Calc. Not all variables in the data frame that contains the data, mydata by default, need have a label, and the variables with their corresponding labels can be listed in any order. An example follows.
I2,"This instructor presents material in a clear and organized manner."
I4,"Overall, this instructor was highly effective in this class."
I1,"This instructor has command of the subject."
I3,"This instructor relates course materials to real world situations."

The quotes are needed only for a label that contains a comma, here the label for I4.
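A labels-only file in this layout is ordinary csv without a header row, so it can also be inspected with base R. The following sketch assumes that approach; the object name lbl and the column names are illustrative and are not part of lessR.

```r
# Labels-only file: variable name, comma, label on every row
labels_file <- tempfile(fileext = ".csv")
writeLines(c('I2,"This instructor presents material in a clear and organized manner."',
             'I4,"Overall, this instructor was highly effective in this class."'),
           labels_file)

# No header row; the first column supplies the row names
lbl <- read.csv(labels_file, header = FALSE, row.names = 1,
                col.names = c("name", "label"),
                stringsAsFactors = FALSE)

lbl["I4", "label"]   # look up a label by variable name
```

The quoted label for I4 keeps its embedded comma as part of a single field.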
read.csv, read.spss, read.fwf, attach, head, tail, str, corRead.

# remove the # sign before each of the following Read statements to run

# to browse for a csv data file on the computer system, invoke Read with
# the ref argument empty, which, in turn, invokes read.csv(file.choose()),
# and then automatically invokes the attach, head and tail statements
# Read()

# short name
# rad()

# same as above, but include standard read.csv options to indicate
# no variable names in first row of the csv data file
# and then provide the names
# also indicate that the first column is an ID field
# Read(header=FALSE, col.names=c("X", "Y"), row.names=1)

# read a csv data file from the web
# Read("http://web.pdx.edu/~gerbing/data/twogroup.csv")

# read a csv data file with -99 and XXX set to missing
# Read(missing=c(-99, "XXX"))

# do not display any output
# rad.brief()

# read tab-delimited (or any other white-space) data
# Read(sep="")

# read variable labels only, no data
# rad.labels()

# read data and variable labels
# rad.both()

# read a data file that consists of a
# 5 column ID field, 2 column Age field
# and 75 single columns of data, no spaces between columns
# name the variables with lessR function: to
# the variable names are Q01, Q02, ... Q74, Q75
# Read(widths=c(5,2,rep(1,75)), col.names=c("ID", "Age", to("Q", 75)))