In this tutorial, we will learn how to find the standard deviation of a NumPy array using the numpy.std() function. We will walk through the calculation in detail, explaining every part of the code with examples.
What is Numpy Standard Deviation?
NumPy is a library for working with numeric data. It contains a set of tools for creating a data structure called a NumPy array. In its two-dimensional form, this is basically a row-and-column grid of numbers.
Standard Deviation: A standard deviation is a statistic that measures the amount of variation in a dataset relative to its mean and is calculated as the square root of the variance. It is calculated by determining each data point’s deviation relative to the mean.
SD = sqrt( sum( (x - u)^2 ) / N )
Where,
- SD = standard deviation
- x = each value of the array
- u = mean of all the values
- N = number of values
The numpy module in Python provides many functions, one of which is numpy.std(). It computes the standard deviation along the specified axis and returns the standard deviation of the array elements. The standard deviation is the square root of the average squared deviation from the mean (this average is known as the variance).
Standard Deviation = sqrt(mean(abs(x - x.mean())**2))
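The formula can be followed step by step in code. The sketch below is only an illustration: it reuses the values [2, 1, 7] from the first example further down, and the intermediate names (mean, squared_dev, variance, manual_sd) are chosen here for readability.

```python
import numpy as np

x = np.array([2, 1, 7])

# Step 1: mean of the values
mean = x.sum() / x.size
# Step 2: squared deviation of each value from the mean
squared_dev = (x - mean) ** 2
# Step 3: variance is the average squared deviation
variance = squared_dev.sum() / x.size
# Step 4: standard deviation is the square root of the variance
manual_sd = np.sqrt(variance)

print("manual :", manual_sd)
print("np.std :", np.std(x))
```

Both lines should print the same value, since np.std() with its default arguments performs the same calculation.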
Syntax of Numpy Standard Deviation
numpy.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<class numpy._globals._NoValue>)
Parameters of Numpy Standard Deviation
- a: array_like – The input array whose standard deviation is calculated.
- axis: None, int, or tuple of ints – Optional. The axis or axes along which the standard deviation is calculated. By default, it is calculated over the flattened array. If a tuple of ints is given, the standard deviation is computed over multiple axes, instead of a single axis or all the axes.
- dtype: data_type – Optional. The data type used in the calculation. By default, it is float64 for integer arrays; for floating-point arrays it is the same as the array type.
- out: ndarray – Optional. An alternative output array in which the result is placed. It must have the same shape as the expected output, but the calculated values are cast to its type if necessary.
- ddof: int – Optional. The delta degrees of freedom. The divisor used in the calculation is N - ddof, where N is the number of elements. By default, ddof is zero.
- keepdims: bool – Optional. When set to True, the reduced axes are left in the result as dimensions of size one. If the default value is passed, keepdims is not passed through to the std method of sub-classes of ndarray; any non-default value is.
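To make the optional parameters more concrete, here is a minimal sketch of ddof, keepdims and a tuple-valued axis. The arrays data and cube are illustrative values chosen for this sketch, not part of the numpy API.

```python
import numpy as np

data = np.array([[2, 4, 6, 8],
                 [2, 6, 9, 7]])

# ddof=0 (the default) divides by N; ddof=1 divides by N - 1
population_sd = np.std(data)
sample_sd = np.std(data, ddof=1)

# keepdims=True keeps the reduced axis as a dimension of size one
row_sd = np.std(data, axis=1)                      # shape (2,)
row_sd_kept = np.std(data, axis=1, keepdims=True)  # shape (2, 1)

# A tuple of axes reduces over several axes at once
cube = np.arange(24).reshape(2, 3, 4)
sd_over_first_two = np.std(cube, axis=(0, 1))      # shape (4,)

print(population_sd, sample_sd)
print(row_sd.shape, row_sd_kept.shape, sd_over_first_two.shape)
```

Keeping the reduced axis is mostly useful when the result has to broadcast against the original array, for example when dividing each row by its own standard deviation.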
Returns
It returns a new array containing the standard deviation. If the ‘out’ parameter is not set to ‘None’, it returns a reference to the output array instead.
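The behaviour with ‘out’ can be checked with a small sketch. The name buffer is an assumption made for this illustration; its shape must match the result, which is (4,) when reducing a (2, 4) array along axis 0.

```python
import numpy as np

data = np.array([[2, 4, 6, 8],
                 [2, 6, 9, 7]])

# Pre-allocated output array with the shape of the expected result
buffer = np.empty(4, dtype=np.float64)
returned = np.std(data, axis=0, out=buffer)

# The result is written into buffer, and the return value refers to it
print(buffer)
print(np.shares_memory(returned, buffer))
```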
Examples of Numpy Standard Deviation
1. Numpy.std() – 1D array
import numpy as np
arr = np.array([2, 1, 7])
result = np.std(arr)
print("arr : ", arr)
print("SD : ", result)
Output:
arr : [2 1 7]
SD : 2.6246692913372702
Explanation:
First, we import numpy under the alias np. Next, we create an array ‘arr’ with the array() function. We then pass ‘arr’ to the std() function and store the returned value in the variable ‘result’. Finally, we print the result.
2. Numpy.std() using dtype=float32
import numpy as np
Arr = [8,9,8,2,8,2]
result = np.std(Arr)
print("Arr : ", Arr)
print("SD: ", result)
print("Lower precision value with float32")
print("SD: ", np.std(Arr, dtype = np.float32))
Output:
Arr : [8, 9, 8, 2, 8, 2]
SD: 2.9674156357941426
Lower precision value with float32
SD: 2.9674158
Explanation:
First, we import numpy under the alias np. Next, we create a Python list ‘Arr’; std() accepts any array-like input, so an explicit array() call is not needed. We pass ‘Arr’ to std(), store the returned value in ‘result’, and print it. Finally, we call std() again with dtype = np.float32 and print the output. Note that float32 carries fewer significant digits than float64, so this second value is less precise, not more: it is rounded to about seven significant digits.
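The effect of the dtype becomes visible on larger single-precision inputs. The sketch below uses made-up data (half the values are 1.0, half are 0.1, so the ideal result is 0.45) and is only meant to illustrate that accumulating in float64 is more accurate than accumulating in float32. The exact digits printed may differ between platforms and numpy versions.

```python
import numpy as np

# Illustrative data: a large float32 array with two distinct values
a = np.zeros((2, 512 * 512), dtype=np.float32)
a[0, :] = 1.0
a[1, :] = 0.1

sd32 = np.std(a)                    # computed in float32, the input's own type
sd64 = np.std(a, dtype=np.float64)  # computed in float64

print(type(sd32), sd32)
print(type(sd64), sd64)
```

For float inputs the calculation uses the array's own type by default, so passing dtype=np.float64 is a simple way to reduce rounding error when the input is stored in single precision.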
3. Numpy.std() using dtype=float64
import numpy as np
Arr = [8,9,8,2,8,2]
result = np.std(Arr)
print("Arr : ", Arr)
print("SD: ", result)
print("Explicit float64 (same as the default for integer input)")
print("SD: ", np.std(Arr, dtype = np.float64))
Output:
Arr : [8, 9, 8, 2, 8, 2]
SD: 2.9674156357941426
Explicit float64 (same as the default for integer input)
SD: 2.9674156357941426
Explanation:
First, we import numpy under the alias np. Next, we create a Python list ‘Arr’. We pass ‘Arr’ to std(), store the returned value in ‘result’, and print it. Finally, we call std() again with dtype = np.float64 and print the output. Because float64 is already the default for integer input, both results are identical.
4. Numpy.std() – 2D Array
import numpy as np
arr = np.array([[2,4,6,8],[2,6,9,7]])
print("Array : ",arr)
result = np.std(arr)
print("SD : ",result)
Output:
Array : [[2 4 6 8]
[2 6 9 7]]
SD : 2.449489742783178
Explanation:
First, we import numpy under the alias np. Next, we create a 2D array ‘arr’ with the array() function. We then pass ‘arr’ to the std() function and store the returned value in the variable ‘result’. Because no axis is given, std() works on the flattened array, so the result is a single value computed over all eight elements. Finally, we print the result.
5. Using axis=0 on 2D-array to find Numpy Standard Deviation
import numpy as np
arr = np.array([[2,4,6,8],[2,6,9,7]])
print("Array : ",arr)
result = np.std(arr, axis=0)
print("SD : ",result)
Output:
Array : [[2 4 6 8]
[2 6 9 7]]
SD : [0. 1. 1.5 0.5]
Explanation:
First, we import numpy under the alias np. Next, we create a 2D array ‘arr’ with the array() function. We then pass ‘arr’ to the std() function together with one more parameter, axis=0, and store the returned value in the variable ‘result’. With axis=0 the standard deviation is calculated down each column, so the result has one value per column (four values here). Finally, we print the result.
6. Using axis=1 in 2D-array to find Numpy Standard Deviation
import numpy as np
arr = np.array([[2,4,6,8],[2,6,9,7]])
print("Array : ",arr)
result = np.std(arr, axis=1)
print("SD : ",result)
Output:
Array : [[2 4 6 8]
[2 6 9 7]]
SD : [2.23606798 2.54950976]
Explanation:
First, we import numpy under the alias np. Next, we create a 2D array ‘arr’ with the array() function. We then pass ‘arr’ to the std() function together with one more parameter, axis=1, and store the returned value in the variable ‘result’. With axis=1 the standard deviation is calculated across each row, so the result has one value per row (two values here). Finally, we print the result.
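One way to convince yourself of what each axis means is to compute the same values one slice at a time. This is just a sketch for checking the two examples above; the names col_sd_manual and row_sd_manual are made up for the illustration.

```python
import numpy as np

arr = np.array([[2, 4, 6, 8],
                [2, 6, 9, 7]])

col_sd = np.std(arr, axis=0)  # one value per column
row_sd = np.std(arr, axis=1)  # one value per row

# The same results, computed slice by slice
col_sd_manual = np.array([np.std(arr[:, j]) for j in range(arr.shape[1])])
row_sd_manual = np.array([np.std(arr[i, :]) for i in range(arr.shape[0])])

print(col_sd, col_sd_manual)
print(row_sd, row_sd_manual)
```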
Numpy standard deviation and mean
We can find the standard deviation and mean of a NumPy array with the built-in functions numpy.std() and numpy.mean().
For example:
import numpy as np
arr= np.arange(5)
Mean = np.mean(arr)
print(Mean)
Stddev = np.std(arr)
print(Stddev)
Numpy standard deviation and Variance
We can find the standard deviation and variance of a NumPy array using the built-in functions or the formulae.
For example:
import numpy as np
arr = np.arange(6)
Variance = np.var(arr)
print(Variance)
Stddev = np.std(arr)
print(Stddev)
Or we can also use:
Std1 = np.sqrt(np.mean((arr - np.mean(arr)) ** 2))
print("\nstd: ", Std1)
Var1 = np.mean((arr - np.mean(arr)) ** 2)
print("\nvariance: ", Var1)
Numpy convolve standard deviation
We can convolve two NumPy arrays using the convolve function. Its syntax is:
np.convolve(a, b, mode)
The mode is specified by the user and can be full, same, or valid. It is full by default. Here, a is the first array and b is the second array with which the convolution is performed.
In full mode, the length of the resulting array is size of a + size of b - 1. Same mode returns an array whose length is the larger of the two input lengths, and valid mode returns only those elements that can be computed without zero padding.
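The three modes can be compared on a small input. The second half of the sketch shows one possible way to combine convolve with the standard deviation, namely a moving (rolling) standard deviation built from the identity variance = mean(x^2) - mean(x)^2. The helper moving_std is hypothetical, written for this illustration, and is not a numpy function.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 1, 1])

full = np.convolve(a, b, mode="full")    # length 5 + 3 - 1 = 7
same = np.convolve(a, b, mode="same")    # length max(5, 3) = 5
valid = np.convolve(a, b, mode="valid")  # length 5 - 3 + 1 = 3
print(full, same, valid)


def moving_std(x, window):
    """Population standard deviation of each full window of x (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    kernel = np.ones(window) / window                    # averaging kernel
    mean = np.convolve(x, kernel, mode="valid")          # moving mean of x
    mean_sq = np.convolve(x * x, kernel, mode="valid")   # moving mean of x^2
    # Clip tiny negative values caused by floating-point rounding
    var = np.maximum(mean_sq - mean ** 2, 0.0)
    return np.sqrt(var)


rolling = moving_std(a, 3)
print(rolling)
```

Using mode="valid" here means only windows that lie fully inside the array are reported. The mean(x^2) - mean(x)^2 shortcut can lose accuracy when the values are large relative to their spread, so treat it as a quick sketch rather than a robust implementation.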
Numpy standard deviation ignore Nan
The numpy.nanstd() function ignores NaN values when calculating the standard deviation of a NumPy array.
Example:
Ans= np.nanstd(myarr)
It also accepts other parameters, such as axis, dtype, out, ddof and keepdims.
Here, axis specifies the axis along which the standard deviation is calculated, dtype is the data type used in the calculation, and out is an alternative array in which the result is stored. The delta degrees of freedom, ddof, is 0 by default.
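A short sketch of the difference between std() and nanstd() on data containing a NaN. The values in myarr are made up for the illustration.

```python
import numpy as np

myarr = np.array([1.0, np.nan, 3.0, 5.0])

plain = np.std(myarr)        # NaN propagates through the calculation
ignoring = np.nanstd(myarr)  # NaN is skipped; uses only 1.0, 3.0 and 5.0

print(plain)
print(ignoring)
print(np.std([1.0, 3.0, 5.0]))  # same as the nanstd result
```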
FAQs
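A stdev function that calculates the sample standard deviation (for example, statistics.stdev() in Python's standard library) divides by n-1, where n stands for the number of elements in the dataset, whereas numpy.std() divides by n by default (ddof=0). So your results can vary a little depending on which function you use. Passing ddof=1 to numpy.std() gives the n-1 version.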
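The difference is easy to see side by side. This sketch assumes the comparison is with statistics.stdev() from Python's standard library and reuses the list from the dtype examples above.

```python
import statistics
import numpy as np

values = [8, 9, 8, 2, 8, 2]

sample_sd = statistics.stdev(values)      # divides by n - 1
population_sd = np.std(values)            # divides by n (ddof=0)
numpy_sample_sd = np.std(values, ddof=1)  # divides by n - 1, like statistics.stdev

print(sample_sd)
print(population_sd)
print(numpy_sample_sd)
```

The n-1 version is always slightly larger than the n version for the same data, and the gap shrinks as the number of elements grows.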
Conclusion
In this tutorial, we have learned in detail how to calculate the standard deviation using the numpy.std() function. We have also worked through the examples in detail to understand the concept better.
However, if you have any doubts or questions, do let me know in the comment section below. I will try to help you as soon as possible.
Happy Pythoning!