How To Use Scree Plot In Python To Explain PCA Variance

In this article, we look at how to use a scree plot in Python. A scree plot is a line graph or bar chart that shows how much variance each principal component explains, which makes it a handy companion to PCA (Principal Component Analysis). Let us learn about the scree plot in Python.

A scree plot is a graph of the eigenvalues of the components (not the eigenvectors). It helps determine how many components to keep in PCA (Principal Component Analysis) or how many factors to keep in FA (Factor Analysis). The scree plot is also known as the scree test. In a scree plot, the eigenvalues form a downward curve because they are ordered from largest to smallest.

Importance of Scree Plot in PCA

PCA is a dimensionality reduction technique that transforms a high-dimensional data set into a new, lower-dimensional data set while preserving as much of the information (variance) in the original data as possible. Whenever we work with PCA, we encounter eigenvalues and eigenvectors.

A scree plot is a useful tool for checking whether PCA is working well on our data. Each principal component captures part of the variation in the data. The components are labelled PC1, PC2, PC3, and so on. PC1 captures the most variation, PC2 captures the next most, and so on. The advantage is that if PC1, PC2, and PC3 together capture most of the variation, we can ignore the rest.
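As a minimal sketch of this idea, the snippet below turns a set of eigenvalues into each component's share of the total variance and finds how many leading components are needed to reach a chosen share. The helper name components_needed, the 0.75 threshold, and the example eigenvalues are illustrative assumptions, not part of any library.

```python
import numpy as np

def components_needed(eigenvalues, threshold=0.75):
    """Smallest number of leading components whose cumulative share
    of the total variance reaches `threshold` (hypothetical helper).
    `threshold` is assumed to lie between 0 and 1."""
    values = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(values) / values.sum()
    # argmax returns the first position where the condition is True
    return int(np.argmax(cumulative >= threshold)) + 1

eigenvalues = [5.0, 3.0, 1.0, 1.0]    # made-up values for illustration
shares = np.array(eigenvalues) / np.sum(eigenvalues)
print(shares)                          # share of variance per component
print(components_needed(eigenvalues))  # components needed for a 75% share
```

A higher threshold keeps more components; which threshold is appropriate depends on the data and the task.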

Scree Plot Criterion

A common way to decide how many principal components to keep is a graphical one, known as the scree plot. The scree plot shows the eigenvalue for each principal component.

The graph shows the eigenvalues on the y-axis and the number of components (or factors) on the x-axis. It is a downward curve. Most scree plots look similar in shape: PC1 accounts for most of the variation, PC2 for a moderate amount, and the remaining components for only a small part each, so the curve drops steeply and then flattens out.
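One rough way to read such a curve in code is to look for the point where it bends most sharply, often called the elbow, and keep the components before it. The sketch below is only a heuristic under that assumption: the helper name elbow_components and the sample eigenvalues are made up for illustration, and the input is assumed to hold at least three eigenvalues sorted from largest to smallest.

```python
import numpy as np

def elbow_components(eigenvalues):
    """Rough elbow heuristic (an assumption, not a standard API):
    keep the components that come before the sharpest bend."""
    values = np.asarray(eigenvalues, dtype=float)
    # Second differences measure how sharply the curve bends at each point
    bend = np.diff(values, n=2)
    # bend[i] belongs to point i + 1, so i + 1 components precede the bend
    return int(np.argmax(bend)) + 1

# Made-up eigenvalues: a steep drop after the second component
print(elbow_components([4.0, 2.5, 0.5, 0.4, 0.3]))
```

A visual check of the plot is still worthwhile, since real curves do not always have a single clear bend.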

Steps to be followed in PCA

  • First, standardize the data.
  • Second, calculate the covariance matrix.
  • Third, find the eigenvalues and eigenvectors of that covariance matrix.
  • Fourth, sort the eigenvalues (and their eigenvectors) in descending order.
  • Fifth, transform the original data by projecting it onto the top eigenvectors.
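The steps above can be sketched with NumPy alone. This is a minimal illustration, not a replacement for a library implementation: the function name pca_steps and the random sample data are assumptions made for the example.

```python
import numpy as np

def pca_steps(data, n_components):
    """Hypothetical helper that follows the five steps above."""
    # 1. Standardize: zero mean and unit variance per column
    standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # 2. Covariance matrix of the standardized columns
    covariance = np.cov(standardized, rowvar=False)
    # 3. Eigenvalues and eigenvectors (eigh is meant for symmetric matrices)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    # 4. Sort in descending order of eigenvalue
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    # 5. Project the data onto the leading eigenvectors
    projected = standardized @ eigenvectors[:, :n_components]
    return eigenvalues, projected

rng = np.random.default_rng(0)
sample = rng.normal(size=(50, 4))   # made-up data: 50 samples, 4 features
eigenvalues, projected = pca_steps(sample, n_components=2)
print(eigenvalues)       # sorted from largest to smallest
print(projected.shape)   # one row per sample, one column per component
```

The sorted eigenvalues returned here are exactly the values a scree plot puts on its y-axis.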

Code

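import numpy as np
import matplotlib.pyplot as plt

# Random data: 6 samples, 9 features
N = np.random.randn(6, 9)
# 9x9 scatter matrix N^T N (symmetric, positive semi-definite)
N = N.T @ N
# For such a matrix, the singular values in B equal its eigenvalues
A, B, C = np.linalg.svd(N)
# Scale the eigenvalues so they sum to 1 (share of the total)
eigen_values = B / np.sum(B)
figure = plt.figure(figsize=(10, 6))
sing_vals = np.arange(len(eigen_values)) + 1
plt.plot(sing_vals, eigen_values, 'ro-', linewidth=2)
plt.title('Scree Plot')
plt.xlabel('Principal Component')
plt.ylabel('Eigenvalue (share of total)')
plt.show()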

This is the code to draw the scree plot. The x-axis shows the principal component number, and the y-axis shows the eigenvalues, scaled so that they sum to 1. The result is a downward curve.

Output

[Figure: scree plot in Python]


Applications of PCA

  • PCA is useful for image compression. Keeping only the leading components stores an image with far fewer numbers.
  • It is also useful in food science.
  • It is used in the banking and healthcare industries.
  • PCA is useful in finance.
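To make the image compression use case concrete, here is a small sketch that compresses the 8x8 digit images bundled with sklearn and restores them again. The helper name compress_and_restore and the choice of 8 and 32 components are assumptions made for illustration.

```python
from sklearn import datasets, decomposition

def compress_and_restore(images, n_components):
    """Hypothetical helper: encode with PCA, decode, report the error."""
    pca = decomposition.PCA(n_components=n_components)
    codes = pca.fit_transform(images)          # compressed representation
    restored = pca.inverse_transform(codes)    # back to pixel space
    error = float(((images - restored) ** 2).mean())
    return codes, restored, error

digits = datasets.load_digits().data           # 1797 images, 64 pixels each
for k in (8, 32):
    codes, restored, error = compress_and_restore(digits, k)
    print(k, codes.shape, round(error, 3))
```

Fewer components give a smaller representation but a larger reconstruction error, which is the trade-off a scree plot helps to judge.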

PCA using sklearn library

import numpy as np
from sklearn import decomposition
from sklearn import datasets
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

First, import the necessary libraries. Here we use the NumPy, matplotlib, and sklearn libraries. From sklearn, we import datasets and decomposition.

np.random.seed(5)
a = datasets.load_iris()
X = a.data
y = a.target

Next, load the Iris data set. X holds the feature data, and y holds the target (the class label of each sample).

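figure = plt.figure(1, figsize=(5, 4))
plt.clf()
# Create the 3D axis through the figure so it is attached to it
axis = figure.add_subplot(111, projection='3d', elev=45, azim=132)
axis.set_position([0, 0, 0.95, 1])
plt.cla()
PCA = decomposition.PCA(n_components=3)
PCA.fit(X)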

Next, set up the figure and the 3D axis, and create a PCA with 3 components.

X = PCA.transform(X)
# Reorder the labels so the three classes get distinct colors
y = np.choose(y, [1, 2, 0]).astype(float)
axis.scatter(X[:, 0], X[:, 1], X[:, 2], c=y, cmap=plt.cm.nipy_spectral,
             edgecolor='k')
# Hide the tick labels on all three axes
axis.xaxis.set_ticklabels([])
axis.yaxis.set_ticklabels([])
axis.zaxis.set_ticklabels([])
plt.show()

Finally, plt.show() displays the 3D scatter plot of the first three principal components.

Output

[Figure: 3D scatter plot from PCA using the sklearn library]
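The same sklearn PCA object can also feed a scree plot. The sketch below is one possible way to do it, assuming we fit a fresh PCA with all components on the Iris data; the variable names iris_data, full_pca, and ratios are chosen for this example only.

```python
import matplotlib.pyplot as plt
from sklearn import datasets, decomposition

iris_data = datasets.load_iris().data
# Keep all components so the plot covers every eigenvalue
full_pca = decomposition.PCA().fit(iris_data)
ratios = full_pca.explained_variance_ratio_
numbers = range(1, len(ratios) + 1)

fig, ax = plt.subplots(figsize=(6, 4))
# Bars show each component's share, the line shows the running total
ax.bar(numbers, ratios, alpha=0.5, label='Per component')
ax.plot(numbers, ratios.cumsum(), 'ro-', label='Cumulative')
ax.set_xticks(list(numbers))
ax.set_title('Scree Plot (Iris)')
ax.set_xlabel('Principal Component')
ax.set_ylabel('Share of variance explained')
ax.legend()
plt.show()
```

Here explained_variance_ratio_ gives each eigenvalue as a share of the total, so the bars sum to 1.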

Factor Analysis using PCA

Factor analysis is a statistical method that we can apply to discover root causes, or hidden factors, that are present in the data set but not directly observable. Using factor analysis, we can find latent variables that explain the pattern of observed behavior.

It explains the variance among the observed variables and condenses a set of observed variables into a smaller set of unobserved variables. These unobserved variables are called factors.
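As a hedged illustration, sklearn provides a FactorAnalysis estimator in its decomposition module. The sketch below fits it to the standardized Iris data; the choice of two factors and the variable names are assumptions made for this example.

```python
from sklearn import datasets
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

# Standardize so every observed variable is on the same scale
observed = StandardScaler().fit_transform(datasets.load_iris().data)

# Two latent factors is an arbitrary choice for illustration
fa = FactorAnalysis(n_components=2, random_state=0)
scores = fa.fit_transform(observed)   # factor values for each sample
loadings = fa.components_             # how each factor relates to each variable
noise = fa.noise_variance_            # variance left unexplained per variable

print(scores.shape)
print(loadings.shape)
print(noise.shape)
```

The loadings show which observed variables move together under each factor, and the noise variances show how much of each variable the factors leave unexplained.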

Main Assumptions

  • The data set should not have outliers.
  • The number of samples should be greater than the number of factors.
  • For example, with 10 samples and 3 factors, the data set can be used for FA.

Uses

  • Fraud detection
  • Spam detection

Difference between PCA and FA

PCA | FA
PCA stands for Principal Component Analysis. | FA stands for Factor Analysis.
It is useful to transform the data from a larger to a smaller number of components. | It is useful to understand the underlying "cause."
It can be computed with SVD (Singular Value Decomposition). | It is also known as Common Factor Analysis.
It explains the cumulative variance in the predictors. | It explains the correlation between the variables.

1. What is plotted on a scree plot?

The graph shows the eigenvalues on the y-axis and the number of components (or factors) on the x-axis.

2. Is the scree plot an upward curve or a downward curve?

The scree plot shows a downward curve, because the eigenvalues are sorted from largest to smallest.

3. What is the use of the scree plot?

A scree plot is useful to determine how many components to keep in PCA (Principal Component Analysis) or how many factors to keep in FA (Factor Analysis).

4. What is another name for the scree plot?

Another name for the scree plot is the scree test.

In The End

We have looked at the scree plot in Python. A scree plot is nothing more than a simple graph of eigenvalues. We covered what a scree plot is, the scree plot criterion, the importance of the scree plot in PCA, and the applications of PCA. We hope this article was easy to follow and useful.
