lessR (version 2.1.1)

dens: Data Based Density Curves with Color and Histogram

Description

Plots a normal density curve and/or a general density curve superimposed over a histogram, all estimated from the data. Also reports the Shapiro-Wilk normality test.

Usage

dens(x, dframe=mydata, 
         bw="nrd0", type=c("both", "general", "normal"),
         col.bg="ghostwhite", col.grid="grey90", col.bars="grey86",
         col.nrm="black", col.gen="black",
         col.fill.nrm=rgb(80,150,200, alpha=70, max=255), 
         col.fill.gen=rgb(250,210,230, alpha=70, max=255),
         bin.start=NULL, bin.width=NULL, text.out=TRUE,
         x.pt=NULL, xlab=NULL, main=NULL, y.axis=FALSE, ...)

color.density(...)

Arguments

x
Variable for which to construct the histogram and density plot.
dframe
Data frame that contains the variable of interest. The default is mydata.
bw
Bandwidth of kernel estimation.
type
Type of density curve plotted. By default, both the general density and the normal density are plotted.
col.bg
Color of the plot background.
col.grid
Color of the grid lines.
col.bars
Color of the histogram bars. The default is to display the histogram in a light gray. To suppress the histogram, specify a color of "transparent".
col.nrm
Color of the normal curve.
col.gen
Color of the general density curve.
col.fill.nrm
Fill color for the estimated normal curve, with a transparent blue as the default.
col.fill.gen
Fill color for the estimated general density curve, with a transparent light red as the default.
bin.start
Optional specified starting value of the bins.
bin.width
Optional specified bin width, which can be specified with or without a bin.start value.
text.out
If TRUE, then display text output in console.
x.pt
Value of the point on the x-axis around which to draw a unit interval, illustrating the corresponding area under the general density curve. Only applies when requesting type="general".
xlab
Label for x-axis.
main
Title of graph.
y.axis
Specifies if the y-axis, the density axis, should be included.
...
Other parameter values for graphics as defined and processed by plot, including xlim, ylim, lwd, cex.lab, col.main and density.

Details

Results are based on the standard R functions dnorm and density for estimating densities from data, as well as the hist function for calculating a histogram. Colors are provided by default and can also be specified.
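The base R building blocks named above can be called directly to see what each contributes. The following is only a sketch under stated assumptions, not the internals of dens: the seed, the 200-point grid and the object names are illustrative choices.

```r
# Sketch only: plain base R calls, not the code inside dens
set.seed(1)                      # arbitrary seed, for reproducibility
y <- rnorm(100)

h   <- hist(y, plot = FALSE)     # histogram bins, computed without drawing
d   <- density(y, bw = "nrd0")   # general (kernel) density estimate
xs  <- seq(min(y), max(y), length.out = 200)   # assumed evaluation grid
nrm <- dnorm(xs, mean = mean(y), sd = sd(y))   # normal curve from sample mean and sd
sw  <- shapiro.test(y)           # Shapiro-Wilk normality test

length(h$counts)                 # number of bins
d$bw                             # bandwidth selected by the "nrd0" rule
sw$p.value                       # p-value of the normality test
```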

The input data frame has the assumed name of mydata. If the data frame has a different name, specify that name with the dframe option. Regardless of its name, the data frame need not be attached: the variable is referenced directly by its name, without the mydata$name notation. Any missing data values are removed, and the effective sample size and the number of missing values are reported.
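One common base R idiom for referencing a variable by its bare name inside an unattached data frame, and for dropping missing values, is sketched below. The helper lookup_var is hypothetical and is not a lessR function; how dens itself resolves the name is not stated here.

```r
# Hypothetical helper, for illustration only (not part of lessR)
mydata <- data.frame(X = c(2.1, NA, 3.4, 5.0, NA, 4.2))

lookup_var <- function(x, dframe = mydata) {
  # evaluate the unquoted variable name within the data frame
  vals <- eval(substitute(x), envir = dframe, enclos = parent.frame())
  keep <- !is.na(vals)
  list(values = vals[keep],      # data with missing values removed
       n      = sum(keep),       # effective sample size
       n.miss = sum(!keep))      # number of missing values
}

res <- lookup_var(X)             # no attach() and no mydata$X needed
res$n
res$n.miss
```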

By default, the histogram is displayed in a light gray, as a background for the normal and/or general estimated density curves, though this color can be changed. Using the alpha option for the rgb function, the density curves are by default plotted with transparent colors to facilitate comparison to the background histogram.
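As a small illustration of the transparency mechanism, the default fill colors from the Usage section can be built directly with rgb. The Usage section writes max=255, which R resolves by partial matching to the maxColorValue argument; the full name is used here.

```r
# Default fill colors from the Usage section, built with the full argument name
fill.nrm <- rgb(80, 150, 200, alpha = 70, maxColorValue = 255)
fill.gen <- rgb(250, 210, 230, alpha = 70, maxColorValue = 255)

fill.nrm   # "#5096C846": the last two hex digits, 46, encode alpha = 70
fill.gen   # "#FAD2E646"

# an opaque version of the same blue omits alpha
rgb(80, 150, 200, maxColorValue = 255)   # "#5096C8"
```

A smaller alpha value gives a more transparent fill, so more of the background histogram shows through.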

The default histogram is the same as the default provided by the hist function itself. The default can be modified with the bin.start and bin.width options. Use the hst function in this package for more control over the parameters of the histogram.
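To illustrate what a starting value and a bin width mean for a histogram, the sketch below builds explicit breaks for hist in base R. This is an assumption about the general idea only, not a description of how dens translates bin.start and bin.width internally; the particular values are arbitrary.

```r
# Sketch: explicit bins from a starting value and a width, in base R
set.seed(1)
y <- rnorm(100)

bin.start <- floor(min(y))       # illustrative starting value at or below min(y)
bin.width <- 0.5                 # illustrative bin width

# the breaks must cover the full range of the data
breaks <- seq(bin.start, max(y) + bin.width, by = bin.width)
h <- hist(y, breaks = breaks, plot = FALSE)

h$breaks                         # equally spaced bin boundaries
sum(h$counts)                    # every observation falls in some bin
```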

The limits for the axes are automatically calculated so as to provide sufficient space for the density curves and histogram, and should generally not require user intervention. Also, the curves are centered over the plot window so that the resulting density curves are symmetric even if the underlying histogram is not. The estimated normal curve is based on the corresponding sample mean and standard deviation.
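A rough base R sketch of the idea follows: choose axis limits wide enough for the histogram and both curves, then draw the curves as transparent filled polygons over the bars. The specific rules for the limits are assumptions made for illustration and may differ from those used by dens.

```r
# Sketch only: axis limits that accommodate histogram and both curves
set.seed(1)
y <- rnorm(100)

h <- hist(y, plot = FALSE)
d <- density(y)

x.lim <- range(h$breaks, d$x)                    # widest horizontal extent
xs    <- seq(x.lim[1], x.lim[2], length.out = 200)
nrm   <- dnorm(xs, mean = mean(y), sd = sd(y))   # normal curve from sample mean and sd
y.lim <- c(0, max(h$density, d$y, nrm))          # tallest bar or curve

plot(h, freq = FALSE, xlim = x.lim, ylim = y.lim,
     col = "grey86", main = "", xlab = "y")
polygon(d$x, d$y, border = "black",
        col = rgb(250, 210, 230, alpha = 70, maxColorValue = 255))
polygon(c(xs[1], xs, xs[200]), c(0, nrm, 0), border = "black",
        col = rgb(80, 150, 200, alpha = 70, maxColorValue = 255))
```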

If x.pt is specified, then type is set to general and y.axis set to TRUE.
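The area that x.pt illustrates can be approximated in base R by integrating the estimated density over an interval of width one. The sketch below assumes the interval is centered on x.pt and uses a simple trapezoid rule; neither choice is confirmed as what dens does.

```r
# Sketch: area under the general density over a unit interval around x.pt
set.seed(1)
y <- rnorm(100)
d <- density(y)

# linear interpolation of the density estimate, zero outside its range
f <- approxfun(d$x, d$y, yleft = 0, yright = 0)

x.pt <- 0                                        # illustrative point
xs <- seq(x.pt - 0.5, x.pt + 0.5, length.out = 201)   # assumed centered interval
ys <- f(xs)

# trapezoid rule: average adjacent heights, multiply by the step widths
area <- sum((ys[-1] + ys[-length(ys)]) / 2 * diff(xs))
area                                             # estimated proportion of values in the interval
```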

A labels data frame named mylabels, obtained from the rad function, can list the label for some or all of the variables in the data frame that contains the data for the analysis. If this labels data frame exists, then the corresponding variable label is listed as the title of the resulting plot, unless a specific label is listed with the main option.

See Also

dnorm, density, hist, plot, rgb, shapiro.test.

Examples

# generate 100 random normal data values
y <- rnorm(100)

# normal curve and general density curves superimposed over histogram
# all defaults
dens(y)

# suppress the histogram, leaving only the density curves
# specify x-axis label per the xlab option for the plot function
dens(y, col.bars="transparent", xlab="My Var")

# specify (non-transparent) colors for the curves,
# to make transparent, need alpha option for the rgb function
dens(y, col.nrm="darkgreen", col.gen="plum")

# display only the general estimated density
#  so do not display the estimated normal curve
# specify the bandwidth for the general density curve,
#  use the standard bw option for the density function
dens(y, type="general", bw=.6)

# display only the general estimated density and a corresponding
#  interval of unit width around x.pt
dens(y, type="general", x.pt=2)

# create data frame, mydata, to mimic reading data with rad function
# although data not attached, access the variable directly by its name
mydata <- data.frame(rnorm(100))
names(mydata) <- "X"
dens(X)

# variable of interest is in a data frame which is not the default mydata
# access the breaks variable in the R provided warpbreaks data set
# although data not attached, access the variable directly by its name
data(warpbreaks)
dens(breaks, dframe=warpbreaks)
