factoextra (version 1.0.3)

fviz_mca: Visualize Multiple Correspondence Analysis

Description

Multiple Correspondence Analysis (MCA) is an extension of simple CA to analyse a data table containing more than two categorical variables. fviz_mca() provides ggplot2-based elegant visualization of MCA outputs from the R functions: MCA [in FactoMineR], and acm [in ade4]. Read more: Multiple Correspondence Analysis Essentials.
  • fviz_mca_ind(): Graph of individuals
  • fviz_mca_var(): Graph of variables
  • fviz_mca_biplot(): Biplot of individuals and variables
  • fviz_mca(): An alias of fviz_mca_biplot()

Usage

fviz_mca_ind(X, axes = c(1, 2), geom = c("point", "text"), label = "all", invisible = "none", labelsize = 4, pointsize = 2, repel = FALSE, habillage = "none", addEllipses = FALSE, ellipse.level = 0.95, ellipse.type = "norm", ellipse.alpha = 0.1, col.ind = "blue", col.ind.sup = "darkblue", alpha.ind = 1, shape.ind = 19, axes.linetype = "dashed", select.ind = list(name = NULL, cos2 = NULL, contrib = NULL), map = "symmetric", title = "Individuals factor map - MCA", jitter = list(what = "label", width = NULL, height = NULL), ...)
fviz_mca_var(X, axes = c(1, 2), geom = c("point", "text"), label = "all", invisible = "none", labelsize = 4, pointsize = 2, col.var = "red", alpha.var = 1, shape.var = 17, col.quanti.sup = "blue", col.quali.sup = "darkgreen", repel = FALSE, title = "Variable categories- MCA", select.var = list(name = NULL, cos2 = NULL, contrib = NULL), axes.linetype = "dashed", map = "symmetric", jitter = list(what = "label", width = NULL, height = NULL))
fviz_mca_biplot(X, axes = c(1, 2), geom = c("point", "text"), label = "all", invisible = "none", labelsize = 4, pointsize = 2, habillage = "none", addEllipses = FALSE, ellipse.level = 0.95, col.ind = "blue", col.ind.sup = "darkblue", alpha.ind = 1, col.var = "red", alpha.var = 1, col.quanti.sup = "blue", col.quali.sup = "darkgreen", repel = FALSE, shape.ind = 19, shape.var = 17, axes.linetype = "dashed", select.var = list(name = NULL, cos2 = NULL, contrib = NULL), select.ind = list(name = NULL, cos2 = NULL, contrib = NULL), map = "symmetric", arrows = c(FALSE, FALSE), title = "MCA factor map - Biplot", jitter = list(what = "label", width = NULL, height = NULL), ...)
fviz_mca(X, ...)

Arguments

X
an object of class MCA [FactoMineR], acm [ade4].
axes
a numeric vector of length 2 specifying the dimensions to be plotted.
geom
a text specifying the geometry to be used for the graph. Allowed values are the combination of c("point", "arrow", "text"). Use "point" to show only points; "text" to show only labels; c("point", "text") or c("arrow", "text") to show both types.
label
a text specifying the elements to be labelled. Default value is "all". Allowed values are "none" or the combination of c("ind", "ind.sup","var", "quali.sup", "quanti.sup"). "ind" can be used to label only active individuals. "ind.sup" is for supplementary individuals. "var" is for active variable categories. "quali.sup" is for supplementary qualitative variable categories. "quanti.sup" is for quantitative supplementary variables.
invisible
a text specifying the elements to be hidden on the plot. Default value is "none". Allowed values are the combination of c("ind", "ind.sup","var", "quali.sup", "quanti.sup").
labelsize
font size for the labels
pointsize
the size of points
repel
a boolean, whether to use ggrepel to avoid overplotting text labels or not.
habillage
an optional factor variable for coloring the observations by groups. Default value is "none". If X is an MCA object from FactoMineR package, habillage can also specify the index of the factor variable in the data.
addEllipses
logical value. If TRUE, draws ellipses around the individuals when habillage != "none".
ellipse.level
the size of the concentration ellipse in normal probability.
ellipse.type
Character specifying frame type. Possible values are 'convex' or types supported by stat_ellipse, including one of c("t", "norm", "euclid").
ellipse.alpha
Alpha for ellipse specifying the transparency level of fill color. Use alpha = 0 for no fill color.
col.ind, col.var
color for individuals and variables, respectively. Possible values also include "cos2", "contrib", "coord", "x" or "y". In this case, the colors for individuals/variables are automatically controlled by their qualities ("cos2"), contributions ("contrib"), coordinates (x^2 + y^2, "coord"), x values ("x") or y values ("y"). To use automatic coloring (by cos2, contrib, ...), make sure that habillage = "none".
col.ind.sup
color for supplementary individuals
alpha.ind, alpha.var
controls the transparency of individual and variable colors, respectively. The value can vary from 0 (total transparency) to 1 (no transparency). Default value is 1. Possible values also include "cos2", "contrib", "coord", "x" or "y". In this case, the transparency of individual/variable colors is automatically controlled by their qualities ("cos2"), contributions ("contrib"), coordinates (x^2 + y^2, "coord"), x values ("x") or y values ("y"). To use this, make sure that habillage = "none".
shape.ind, shape.var
point shapes of individuals and variables.
axes.linetype
linetype of x and y axes.
select.ind, select.var
a selection of individuals/variables to be drawn. Allowed values are NULL or a list containing the arguments name, cos2 or contrib:
  • name: a character vector containing the names of the individuals/variables to be drawn.
  • cos2: if cos2 is in [0, 1], e.g. 0.6, then individuals/variables with a cos2 > 0.6 are drawn. If cos2 > 1, e.g. 5, then the top 5 individuals/variables with the highest cos2 are drawn.
  • contrib: if contrib > 1, e.g. 5, then the top 5 individuals/variables with the highest contrib are drawn.
map
character string specifying the map type. Allowed options include: "symmetric", "rowprincipal", "colprincipal", "symbiplot", "rowgab", "colgab", "rowgreen" and "colgreen". See Details.
title
the title of the graph
jitter
a parameter used to jitter the points in order to reduce overplotting. It is a list containing the objects what, width and height (i.e., jitter = list(what, width, height)).
  • what: the element to be jittered. Possible values are "point" or "p"; "label" or "l"; "both" or "b"
  • width: degree of jitter in x direction
  • height: degree of jitter in y direction
...
Arguments to be passed to the function fviz_mca_biplot()
col.quanti.sup, col.quali.sup
a color for the quantitative/qualitative supplementary variables.
arrows
Vector of two logicals specifying if the plot should contain points (FALSE, default) or arrows (TRUE). First value sets the rows and the second value sets the columns.
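
The selection rule described for select.ind and select.var can be sketched in base R. This is only an illustration of the documented behaviour, not factoextra's own code: select_by_score() is a hypothetical helper, the scores are made-up values, and the strict ">" comparison follows the wording of the argument description (the handling of values exactly equal to the threshold is an assumption).

```r
# Hypothetical helper (not part of factoextra): mimics the documented rule
# for the cos2/contrib entries of select.ind and select.var.
select_by_score <- function(score, value) {
  if (value <= 1) {
    # value in [0, 1]: keep the elements whose score exceeds the threshold
    names(score)[score > value]
  } else {
    # value > 1: keep the top `value` elements with the highest score
    n <- min(value, length(score))
    names(sort(score, decreasing = TRUE))[seq_len(n)]
  }
}

# Made-up cos2 values for five individuals
cos2 <- c(a = 0.82, b = 0.35, c = 0.61, d = 0.10, e = 0.47)

select_by_score(cos2, 0.6)  # "a" "c"      (cos2 > 0.6)
select_by_score(cos2, 3)    # "a" "c" "e"  (top 3 by cos2)
```

In an actual call, the same values would be passed as, for example, select.ind = list(cos2 = 0.6) or select.ind = list(cos2 = 3).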

Value

a ggplot2 plot

Details

The default plot of MCA is a "symmetric" plot in which both rows and columns are in principal coordinates. In this situation, it is not possible to interpret the distance between row points and column points. The simplest way to overcome this problem is to make an asymmetric plot, in which the column profiles are represented in the row space or vice versa. The allowed options for the argument map are:
  • "rowprincipal" or "colprincipal": asymmetric plots with either rows in principal coordinates and columns in standard coordinates, or vice versa. These plots preserve row metric or column metric respectively.
  • "symbiplot": Both rows and columns are scaled to have variances equal to the singular values (square roots of eigenvalues), which gives a symmetric biplot but does not preserve row or column metrics.
  • "rowgab" or "colgab": Asymmetric maps, proposed by Gabriel & Odoroff (1990), with rows (respectively, columns) in principal coordinates and columns (respectively, rows) in standard coordinates multiplied by the mass of the corresponding point.
  • "rowgreen" or "colgreen": The so-called contribution biplots showing visually the most contributing points (Greenacre 2006b). These are similar to "rowgab" and "colgab" except that the points in standard coordinates are multiplied by the square root of the corresponding masses, giving reconstructions of the standardized residuals.
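
The scalings listed above can be made concrete with a small base R sketch. It applies the standard correspondence analysis definitions to a simple two-way table with made-up counts; the variable names are illustrative, and this is not factoextra's internal code, which may compute the coordinates differently. For MCA, the same definitions would apply to the indicator matrix of the categorical variables.

```r
# Illustrative two-way table (made-up counts, not a dataset from the package)
N <- matrix(c(20, 10,  5,
               8, 15, 12,
               4,  9, 17), nrow = 3, byrow = TRUE)

P      <- N / sum(N)     # correspondence matrix
r_mass <- rowSums(P)     # row masses
c_mass <- colSums(P)     # column masses

# Singular value decomposition of the standardized residuals
S   <- diag(1 / sqrt(r_mass)) %*% (P - outer(r_mass, c_mass)) %*% diag(1 / sqrt(c_mass))
dec <- svd(S)
k   <- 1:2               # a 3 x 3 table has at most two non-trivial dimensions
sv  <- dec$d[k]          # singular values (square roots of the eigenvalues)

# Standard coordinates and principal coordinates
row_std <- diag(1 / sqrt(r_mass)) %*% dec$u[, k]
col_std <- diag(1 / sqrt(c_mass)) %*% dec$v[, k]
row_pri <- row_std %*% diag(sv)
col_pri <- col_std %*% diag(sv)

# Column coordinates paired with row_pri under each "row*" option
col_rowprincipal <- col_std                             # standard coordinates
col_rowgab       <- diag(c_mass) %*% col_std            # standard x mass
col_rowgreen     <- diag(sqrt(c_mass)) %*% col_std      # standard x sqrt(mass)

# "symbiplot": rows and columns both scaled by sqrt(singular value)
row_sym <- row_std %*% diag(sqrt(sv))
col_sym <- col_std %*% diag(sqrt(sv))

# Weighted variances per dimension: the eigenvalues for principal
# coordinates, the singular values for the symbiplot scaling
colSums(r_mass * row_pri^2)
colSums(r_mass * row_sym^2)
```

The "col*" options follow by swapping the roles of rows and columns; "symmetric" pairs row_pri with col_pri.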

See Also

get_mca, fviz_pca, fviz_ca, fviz_mfa, fviz_hmfa

Examples

# Multiple Correspondence Analysis
# ++++++++++++++++++++++++++++++
# Install and load FactoMineR to compute MCA
# install.packages("FactoMineR")
library("FactoMineR")
# Load factoextra for the fviz_mca_*() functions used below
library("factoextra")
data(poison)
poison.active <- poison[1:55, 5:15]
head(poison.active)
res.mca <- MCA(poison.active, graph=FALSE)

# Graph of individuals
# +++++++++++++++++++++

# Default Plot
# Color of individuals: col.ind = "steelblue"
fviz_mca_ind(res.mca, col.ind = "steelblue")

# 1. Control automatically the color of individuals 
   # using the "cos2" or the contributions "contrib"
   # cos2 = the quality of the individuals on the factor map
# 2. To keep only point or text use geom = "point" or geom = "text".
# 3. Change themes: http://www.sthda.com/english/wiki/ggplot2-themes

fviz_mca_ind(res.mca, col.ind = "cos2", repel = TRUE)+
theme_minimal()

## Not run:      
# # You can also control the transparency 
# # of the color by the cos2
# fviz_mca_ind(res.mca, alpha.ind="cos2") +
#      theme_minimal()  
# ## End(Not run)
     
# Color individuals by groups, add concentration ellipses
# Remove labels: label = "none".
grp <- as.factor(poison.active[, "Vomiting"])
p <- fviz_mca_ind(res.mca, label="none", habillage=grp,
       addEllipses=TRUE, ellipse.level=0.95)
print(p)
      
    
# Change group colors using RColorBrewer color palettes
# Read more: http://www.sthda.com/english/wiki/ggplot2-colors
p + scale_color_brewer(palette="Dark2") +
    scale_fill_brewer(palette="Dark2") +
     theme_minimal()
     
# Change group colors manually
# Read more: http://www.sthda.com/english/wiki/ggplot2-colors
p + scale_color_manual(values=c("#999999", "#E69F00"))+
 scale_fill_manual(values=c("#999999", "#E69F00"))+
 theme_minimal()  
             
             
# Select and visualize some individuals (ind) with select.ind argument.
 # - ind with cos2 >= 0.4: select.ind = list(cos2 = 0.4)
 # - Top 20 ind according to the cos2: select.ind = list(cos2 = 20)
 # - Top 20 contributing individuals: select.ind = list(contrib = 20)
 # - Select ind by names: select.ind = list(name = c("44", "38", "53",  "39") )
 
# Example: Select the top 20 according to the cos2
fviz_mca_ind(res.mca, select.ind = list(cos2 = 20))

 
# Graph of variable categories
# ++++++++++++++++++++++++++++
# Default plot: use repel = TRUE to avoid overplotting
fviz_mca_var(res.mca, col.var = "#FC4E07")+
theme_minimal()

# Control variable colors using their contributions
# use repel = TRUE to avoid overplotting
fviz_mca_var(res.mca, col.var = "contrib")+
 scale_color_gradient2(low="white", mid="blue", 
           high="red", midpoint=2, space = "Lab") +
 theme_minimal()      
        
   
# Select variables with select.var argument
   # You can select by contrib, cos2 and name 
   # as previously described for ind
# Select the top 10 contributing variables
fviz_mca_var(res.mca, select.var = list(contrib = 10))
    
# Biplot
# ++++++++++++++++++++++++++
grp <- as.factor(poison.active[, "Vomiting"])
fviz_mca_biplot(res.mca, repel = TRUE, col.var = "#E7B800",
 habillage = grp, addEllipses = TRUE, ellipse.level = 0.95)+
 theme_minimal()
 
 ## Not run: 
# # Keep only the labels for variable categories: 
# fviz_mca_biplot(res.mca, label ="var")
# 
# # Keep only labels for individuals
# fviz_mca_biplot(res.mca, label ="ind")
# 
# # Hide variable categories
# fviz_mca_biplot(res.mca, invisible ="var")
# 
# # Hide individuals
# fviz_mca_biplot(res.mca, invisible ="ind")
# 
# # Control automatically the color of individuals using the cos2
# fviz_mca_biplot(res.mca, label ="var", col.ind="cos2") +
#        theme_minimal()
#        
# # Change the color by groups, add ellipses
# fviz_mca_biplot(res.mca, label="var", col.var ="blue",
#    habillage=grp, addEllipses=TRUE, ellipse.level=0.95) + 
#    theme_minimal() 
#                
# # Select the top 30 contributing individuals
# # And the top 10 variables
# fviz_mca_biplot(res.mca,  
#                select.ind = list(contrib = 30),
#                select.var = list(contrib = 10)) 
# ## End(Not run)