Introduction to Biplots
A biplot displays information about both samples and variables of a dataset. In a PCA biplot:
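- Scores (principal component scores) are the coordinates of the observations in the reduced-dimensional space, usually drawn as points or labels.
- Loadings (principal component loadings) describe how each original variable contributes to the principal components, usually drawn as arrows from the origin.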
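As a minimal sketch of where these two quantities live in practice, assuming the PCA is run with base R's prcomp on the four numeric columns of iris (the dataset used in this article): the scores are stored in the x component and the loadings in rotation. The object names pca, scores, loadings, and manual_scores are illustrative only.

```r
# Run PCA on the four numeric iris measurements (standardized)
pca <- prcomp(iris[, 1:4], scale. = TRUE)

# Scores: one row per observation, one column per principal component
scores <- pca$x
dim(scores)     # 150 observations x 4 components

# Loadings: one row per variable, one column per principal component
loadings <- pca$rotation
dim(loadings)   # 4 variables x 4 components

# Scores are the centered, scaled data projected onto the loadings
manual_scores <- scale(iris[, 1:4]) %*% loadings
all.equal(unname(manual_scores), unname(scores))
```

The final comparison shows the link between the two: projecting the standardized data onto the loadings reproduces the scores.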
Basic Biplot with the stats Package
R's base stats package includes functions for PCA and biplots. Run the PCA with prcomp(), setting scale. = TRUE so that each variable is standardized to unit variance, then pass the result to biplot():
# Load the dataset
data(iris)
iris_data <- iris[, 1:4]
# Perform PCA
pca_result <- prcomp(iris_data, scale. = TRUE)
# Create a biplot
biplot(pca_result)
Output:
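This command generates a biplot that combines the PCA scores and loadings for the first two principal components. The arrows represent the loadings of the variables. Each observation is drawn at its score position and labeled with its row name (for iris_data, the row number) rather than plotted as a point.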
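Because the biplot shows only the first two components, it can help to check how much of the total variance they capture before reading too much into the picture. The following is a small sketch using only base R; var_explained is an illustrative name, not part of the prcomp result.

```r
# Same PCA as above, repeated so this snippet runs on its own
pca_result <- prcomp(iris[, 1:4], scale. = TRUE)

# prcomp stores standard deviations; squaring them gives the variances
var_explained <- pca_result$sdev^2 / sum(pca_result$sdev^2)

# Proportion of variance per component
round(var_explained, 3)

# Share of the total variance visible in a PC1 vs PC2 biplot
sum(var_explained[1:2])

# summary() reports the same proportions in a formatted table
summary(pca_result)
```

If the first two components account for only a small share of the variance, the biplot hides much of the structure in the data and should be interpreted with caution.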
Enhanced Biplot with ggplot2 and ggfortify
For more customizable biplots, you can use the ggplot2 package together with ggfortify, which supplies an autoplot() method for prcomp objects.
# Install the packages once (skip if already installed)
install.packages(c("ggplot2", "ggfortify"))
library(ggplot2)
library(ggfortify)
# Create a biplot using ggplot2 and ggfortify
autoplot(pca_result, data = iris, colour = 'Species',
         loadings = TRUE, loadings.label = TRUE, loadings.label.size = 3)
Output:
This command generates a biplot with enhanced customization options:
- colour = 'Species' colors the points based on the species of the iris dataset.
- loadings = TRUE adds arrows for the variable loadings.
- loadings.label = TRUE adds labels to the loadings.
- loadings.label.size = 3 sets the size of the loading labels.
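For finer control than these arguments offer, a similar plot can be assembled by hand from the scores and loadings. This is only a sketch, assuming ggplot2 is installed; scores_df, loadings_df, manual_biplot, and arrow_scale are illustrative names, and arrow_scale is an arbitrary stretch factor chosen to make the arrows visible, not the scaling that ggfortify applies internally.

```r
library(ggplot2)

pca_result <- prcomp(iris[, 1:4], scale. = TRUE)

# Scores for the first two components, plus the grouping variable
scores_df <- data.frame(pca_result$x[, 1:2], Species = iris$Species)

# Loadings for the first two components, stretched by an arbitrary factor
arrow_scale <- 3
loadings_df <- data.frame(pca_result$rotation[, 1:2] * arrow_scale,
                          variable = rownames(pca_result$rotation))

manual_biplot <- ggplot(scores_df, aes(x = PC1, y = PC2)) +
  # Observations as points, colored by species
  geom_point(aes(colour = Species)) +
  # Variables as arrows from the origin
  geom_segment(data = loadings_df,
               aes(x = 0, y = 0, xend = PC1, yend = PC2),
               arrow = arrow(length = unit(0.2, "cm"))) +
  # Variable names at the arrow tips
  geom_text(data = loadings_df,
            aes(x = PC1, y = PC2, label = variable),
            vjust = -0.5, size = 3)

print(manual_biplot)
```

Building the plot from separate layers makes it straightforward to restyle the points, arrows, or labels independently with ordinary ggplot2 functions.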
Biplot with FactoMineR and factoextra
FactoMineR is a comprehensive package for multivariate data analysis, and factoextra provides functions to visualize the results.
# Install the packages once (skip if already installed)
install.packages(c("FactoMineR", "factoextra"))
library(FactoMineR)
library(factoextra)
# Perform PCA using FactoMineR
pca_result_fm <- PCA(iris_data, scale.unit = TRUE, graph = FALSE)
# Create a biplot using FactoMineR and factoextra
fviz_pca_biplot(pca_result_fm, repel = TRUE,
                col.var = "blue",           # Color of the variable arrows
                col.ind = iris$Species,     # Color individuals by group
                palette = c("#00AFBB", "#E7B800", "#FC4E07"),
                addEllipses = TRUE, ellipse.level = 0.95)
Output:
This command generates a biplot with the following features:
- repel = TRUE avoids text overlap.
- col.var = "blue" sets the color of the variable arrows.
- col.ind = iris$Species colors the points based on the species.
- palette defines the color palette for different species.
- addEllipses = TRUE adds confidence ellipses around the groups.
- ellipse.level = 0.95 sets the confidence level of the ellipses.
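Since this section fits the PCA a second time with a different package, a quick consistency check against the prcomp result can be reassuring. The sketch below assumes FactoMineR and factoextra are installed and uses factoextra's get_eigenvalue() helper; eig is an illustrative name.

```r
library(FactoMineR)
library(factoextra)

iris_data <- iris[, 1:4]
pca_result <- prcomp(iris_data, scale. = TRUE)
pca_result_fm <- PCA(iris_data, scale.unit = TRUE, graph = FALSE)

# Eigenvalues and percentage of variance for each component
eig <- get_eigenvalue(pca_result_fm)
eig

# prcomp stores standard deviations, so square them before comparing
all.equal(unname(eig$eigenvalue), unname(pca_result$sdev^2))
```

Both fits work on standardized variables, so the eigenvalues should agree up to numerical tolerance. The scores themselves may differ by a constant factor or by sign, because the two packages do not necessarily use the same conventions for scaling and orientation.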
How to Create a Biplot in R
A biplot is a graphical representation that combines both the scores and loadings of a principal component analysis (PCA) in a single plot. This allows for the visualization of the relationships between variables and observations in a dataset. Creating a biplot in R can be done using several packages, including stats, ggplot2, and FactoMineR. This article will guide you through the steps to create a biplot in R, covering different methods and practical examples.