plottr can be used to create plots for all the variables in a data frame or for a single vector. The output is a list of plots, one per variable, each of class 'analyzerPlot'.
plottr(
  tb,
  yvar = NULL,
  xclasses = NULL,
  yclass = NULL,
  printall = F,
  callasfactor = 1,
  FUN1 = Cx,
  FUN2 = Qx,
  FUN3 = CxCy,
  FUN4 = QxCy,
  FUN5 = CxQy,
  FUN6 = QxQy,
  ...
)
tb: a data.frame or a vector. If the yvar argument is also passed, this must be a data.frame that includes the response variable (yvar).
yvar: a string giving the name of the response (dependent) variable. Can be NULL if there is no response variable. Make sure that this variable is present in tb.
xclasses: a vector of length ncol(tb) giving the data type of each column. Can be NULL, in which case the function assigns a class to each column. The value must be either NULL or a vector whose entries are 'factor' or 'numeric', in the same order as the columns of tb. When tb is a vector, this can be a vector of length 1.
yclass: class of the response variable. Can be NULL, but must have a value when yvar is not NULL. The value can be 'factor' or 'numeric'.
printall: (logical) whether to display the plots. Setting this to FALSE only returns the list of plots silently.
callasfactor: minimum number of unique values needed for x to be considered numeric. See Details for more information.
FUN1: a user-defined function for plotting one variable when the variable is continuous. See Details for how to define these functions.
FUN2: same as FUN1, but for a categorical variable.
FUN3: a user-defined function for plotting two variables when both the independent variable (x) and the dependent variable (y) are continuous.
FUN4: same as FUN3, but when the independent variable (x) is categorical and the dependent variable (y) is continuous.
FUN5: same as FUN3, but when the independent variable (x) is continuous and the dependent variable (y) is categorical.
FUN6: same as FUN3, but when both the independent variable (x) and the dependent variable (y) are categorical.
...: extra arguments passed to the functions FUN1-FUN6.
A list of plots, one for each variable. Each plot has the class analyzerPlot and can be displayed using plot(). If printall = TRUE, all plots are also displayed.
This function helps in understanding the data through multiple visualizations. It works for a data.frame with multiple variables, for a single x variable, or for a combination of predictor x and response y variables. Depending on the classes of x and y, different types of plots are generated automatically.
Please note the following points:
Defining the class of variables: If yvar is not NULL, then yclass must be passed ('factor' for a classification problem, or 'numeric' for regression). xclasses stores the classes of all the variables in the data frame, in the same order as the columns. Note: if yvar is not NULL, then tb must be a data.frame with at least 2 columns (including yvar). In that case xclasses should also include the class of yvar, even though it is also passed through yclass. xclasses can also be set to NULL, in which case the function assigns a class based on the contents: if a variable is of factor or character type, xclasses gets 'factor' for that variable; if it is numeric with fewer unique values than the callasfactor parameter, xclasses gets 'factor'; otherwise it gets 'numeric'.
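The rule applied when xclasses is NULL can be sketched in base R as follows. This only illustrates the rule as described above and is not the package's actual code; the helper name guess_class and its treatment of missing values are assumptions made for the example.

```r
# Hypothetical helper (not part of the package): mirrors the documented rule
# for assigning a class to a column when xclasses is NULL.
guess_class <- function(x, callasfactor = 1) {
  # Factor and character columns are always treated as 'factor'
  if (is.factor(x) || is.character(x)) {
    return("factor")
  }
  # Assumption: unique values are counted as-is, so NA counts as a value here
  if (is.numeric(x) && length(unique(x)) < callasfactor) {
    return("factor")
  }
  "numeric"
}

# With callasfactor = 5, numeric columns with fewer than 5 distinct values
# are classed as 'factor'; the rest stay 'numeric'
classes <- vapply(mtcars, guess_class, character(1), callasfactor = 5)
classes[c("mpg", "cyl", "am")]
```

With the default callasfactor = 1, no numeric column can have fewer unique values than the threshold, so every numeric column stays 'numeric' under this rule.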
DEFINING CUSTOM FUNCTIONS FOR THE PLOTS USING FUN1, FUN2, FUN3 ... FUN6:
Custom plots can be made by passing functions through these arguments. The following rules must be followed when defining such functions:
the returned plot must be of type 'grob', 'gtable' or 'ggplot'. Since these outputs are passed to arrangeGrob, make sure the plots are accepted by the arrangeGrob function. See the code of CxCy for a sample.
not all 6 functions are required to be passed. Only pass those functions for which plots need to be changed.
FUN1 and FUN2 must have 3 parameters: dat (a data.frame holding the data; even a single column must be passed as a one-column data.frame), xname (the name of the column in dat) and ... In addition to these three, any number of additional parameters can be added. See the source code of Cx for a sample.
FUN3, FUN4, FUN5 and FUN6 must have 4 parameters: dat (a data.frame holding the data; it must have two columns, for the independent and dependent variables), xname (the name of the independent variable in dat), yname (the name of the dependent variable in dat) and ... In addition to these four, any number of additional parameters can be added. See the source code of CxCy for a sample.
... must be added as an argument in all the functions.
To get a better idea, see the code of the functions CxCy and Cx.
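As an illustration of these rules, a custom univariate function for a continuous variable might look like the sketch below. It assumes ggplot2 is installed; the name my_hist and the extra bins parameter are made up for this example and are not part of the package.

```r
library(ggplot2)

# Hypothetical replacement for the default continuous univariate plot (FUN1).
# The signature follows the rules above: dat, xname and `...`, plus one
# additional parameter (bins) that must be passed by name.
my_hist <- function(dat, xname, ..., bins = 20) {
  ggplot(dat, aes(x = .data[[xname]])) +
    geom_histogram(bins = bins, fill = "grey60", colour = "white") +
    labs(title = paste("Distribution of", xname), x = xname, y = "Count")
}

# The function can be checked on its own with a one-column data.frame
p <- my_hist(mtcars["mpg"], "mpg")
class(p)
```

Because the function returns a ggplot object, it meets the return-type rule. With the package that provides plottr loaded, it should then be usable as plottr(mtcars$mpg, FUN1 = my_hist).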
Default plots: If y is NULL, then a histogram with density is generated for numeric x. A boxplot is also shown on the same histogram using color and vertical lines. For factor x, a pie chart showing the distribution is generated. These are the univariate plots, which can be modified using the FUN1 and FUN2 arguments. If y is not NULL, then additional plots are added, which can be modified using the FUN3, FUN4, FUN5 and FUN6 arguments:
factor x, factor y: crosstab with heatmap (modified by using FUN6)
factor x, numeric y: histogram and boxplot of y for different values of x (modified by using FUN4)
numeric x, factor y: histogram and boxplot of x for different values of y (modified by using FUN5)
numeric x, numeric y: scatter plot of x and y with rug plot included (modified by using FUN3)
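For instance, the default plot for numeric x and numeric y could be swapped for a scatter plot with a fitted line. The following is a minimal sketch assuming ggplot2 is installed; my_scatter is a hypothetical name, and the choice of a linear fit is only for illustration.

```r
library(ggplot2)

# Hypothetical replacement for the default numeric x / numeric y plot (FUN3).
# Takes the four required parameters: dat, xname, yname and `...`.
my_scatter <- function(dat, xname, yname, ...) {
  ggplot(dat, aes(x = .data[[xname]], y = .data[[yname]])) +
    geom_point(alpha = 0.7) +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    labs(x = xname, y = yname)
}

# Stand-alone check with a two-column data.frame
p <- my_scatter(mtcars[c("disp", "mpg")], "disp", "mpg")
```

It would then be supplied as plottr(mtcars, yvar = "mpg", yclass = "numeric", FUN3 = my_scatter), leaving the other five functions at their defaults.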
# Simple use for one variable
p <- plottr(mtcars$mpg)
# Display the plot
plot(p$x)

# With the complete data frame, treating 'mpg' as the dependent variable
p <- plottr(mtcars, yvar = "mpg", yclass = "numeric")
plot(p$disp)