Rotation of principal components
To make the relationships between variables and PCs easier to interpret, the PCs can be rotated so that each one has high factor loadings for a few variables and low factor loadings for the rest. In other words, each rotated PC becomes strongly correlated with only a small number of variables. The most common form of rotation is varimax rotation, a generalized form of which is implemented in the PCAmixdata package for mixed data.
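As a minimal sketch of what rotation does, the example below applies base R's varimax() to the first two PCs of the built-in USArrests dataset. This is an illustration under assumptions: the dataset is purely numeric and stats::varimax() is the classical method, not the generalized version in PCAmixdata. It should show the loadings being redistributed between the two components while each variable's total squared loading stays the same.

```r
## Illustrative only: classical varimax on a built-in numeric dataset,
## not the PCAmixdata generalization for mixed data
pca <- prcomp(USArrests, scale. = TRUE)

## Loadings of the first two PCs (eigenvectors scaled by the PC standard deviations)
raw_load <- pca$rotation[, 1:2] %*% diag(pca$sdev[1:2])

## Varimax rotation of the two retained components
rot <- varimax(raw_load)
rot_load <- unclass(rot$loadings)

## Rotation only redistributes loadings between components:
## each variable's row sum of squared loadings is unchanged
communality_raw <- rowSums(raw_load^2)
communality_rot <- rowSums(rot_load^2)

## Side-by-side view: columns 1-2 before rotation, columns 3-4 after
print(round(cbind(raw_load, rot_load), 2))
```

Comparing the two pairs of columns makes it easier to see which variables each rotated component is mostly associated with.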
Here we will visualize the result of varimax rotation on relationships between variables and the first two PCs.
## Import library
library(PCAmixdata)
library(FactoMineR)
library(factoextra)

## Import data
df <- read.csv('https://github.com/nchelaru/data-prep/raw/master/telco_cleaned_renamed.csv')

## Drop the TotalCharges variable, as it is a product of MonthlyCharges and Tenure
df <- within(df, rm('TotalCharges'))

## Split quantitative and qualitative variables
split <- splitmix(df)

## FAMD
res.pcamix <- PCAmix(X.quanti=split$X.quanti,
                     X.quali=split$X.quali,
                     rename.level=TRUE,
                     graph=FALSE, ndim=25)

## Add "Churn" as a supplementary variable
res.sup <- supvar(res.pcamix, X.quanti.sup = NULL, X.quali.sup = df[19], rename.level=TRUE)

## Apply varimax rotation to the first two PCs
res.pcarot <- PCArot(res.sup, dim=2, graph=FALSE)

## Visualize factor loadings before rotation
plot(res.sup, choice="sqload", coloring.var=TRUE, axes=c(1, 2),
     leg=TRUE, posleg="topleft", main="Variables before rotation",
     xlim=c(0,1), ylim=c(0,1))

## Visualize factor loadings after rotation
plot(res.pcarot, choice="sqload", coloring.var=TRUE, axes=c(1, 2),
     leg=TRUE, posleg="topright", main="Variables after rotation",
     xlim=c(0,1), ylim=c(0,1))
After rotation, we see higher factor loadings of MonthlyCharges and InternetService for the rotated PC1, and of Tenure and Contract for the rotated PC2 (their projections are more closely aligned with one axis or the other). This suggests that these four variables contribute most to the variation captured by the first two rotated PCs.
Interestingly, the association between Churn and PC2 has decreased after rotation, while its factor loading for PC1 has increased.
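Readings taken from the plots can be cross-checked against the squared loadings themselves. The sketch below follows the pattern of the PCAmixdata vignette and uses the gironde housing data that ships with the package rather than the telco data, so that it runs without a download. It assumes the sqload element is present on both the PCAmix and PCArot results.

```r
## Illustrative only: uses the gironde housing data bundled with PCAmixdata,
## not the telco dataset analyzed above
library(PCAmixdata)
data(gironde)

## Split quantitative and qualitative variables
housing_split <- splitmix(gironde$housing)

## PCA of mixed data, then varimax-type rotation of the first two PCs
res_housing <- PCAmix(X.quanti = housing_split$X.quanti,
                      X.quali = housing_split$X.quali,
                      rename.level = TRUE, graph = FALSE)
rot_housing <- PCArot(res_housing, dim = 2, graph = FALSE)

## Squared loadings before and after rotation, one row per variable
sqload_before <- round(res_housing$sqload[, 1:2], 2)
sqload_after <- round(rot_housing$sqload, 2)
print(sqload_before)
print(sqload_after)
```

The same two tables for the telco analysis would come from res.sup$sqload[, 1:2] and res.pcarot$sqload, assuming those objects were created as in the code above.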