Hello programmers, in today’s article we will learn about the Numpy var() function. The Numpy variance function calculates the variance of Numpy array elements. Variance is the average of the squared deviations from the mean, i.e., var = mean(abs(x - x.mean())**2). The mean is normally calculated as x.sum() / N, where N = len(x). By default, the variance is computed over the flattened array; otherwise, it is computed over the specified axis.
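As a quick check of the formula above, the following minimal sketch computes the variance by hand and compares it with np.var(). The array x and the variable names are illustrative and do not come from the article.

```python
import numpy as np

# Illustrative data (an assumption, not from the article)
x = np.array([1.0, 2.0, 3.0, 4.0])

# Variance written out as the mean of the squared deviations from the mean
manual = np.mean(np.abs(x - x.mean()) ** 2)

# The same quantity computed by NumPy
builtin = np.var(x)

print(manual, builtin)  # both are 1.25
```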
Syntax of Numpy var():
numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>)
Parameters of Numpy Variance
a = Array containing the elements whose variance is to be calculated.
axis = Axis or axes along which the variance is computed. The default is None, which computes the variance of the flattened array. It can also be an int or a tuple of ints, to calculate the variance along one particular axis or along several axes, respectively. (Optional)
dtype = Data type to use in computing the variance. The default is float64 for arrays of integer type; for arrays of float types, it is the same as the array type. (Optional)
out = Alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if needed. (Optional)
ddof = “Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. ddof is zero by default. (Optional)
keepdims = If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array. If the default value is passed, then keepdims will not be passed through to the var() method of sub-classes of ndarray; however, any non-default value will be. (Optional)
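The effect of ddof and keepdims is easier to see on a small input. The sketch below is only an illustration; the arrays x and m and the variable names are assumptions made for this example.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

# ddof=0 (the default) divides by N; ddof=1 divides by N - 1
population_var = np.var(x)        # 5 / 4
sample_var = np.var(x, ddof=1)    # 5 / 3

m = np.arange(6.0).reshape(2, 3)

# Without keepdims the reduced axis disappears;
# with keepdims=True it is kept as a dimension of size one
row_var = np.var(m, axis=1)
row_var_kept = np.var(m, axis=1, keepdims=True)
print(row_var.shape, row_var_kept.shape)  # (2,) (2, 1)

# The kept dimension lets the result broadcast against the input
scaled = m / np.sqrt(row_var_kept)
print(scaled.shape)  # (2, 3)
```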
Return type of Numpy var() function in Python:
Returns the variance of the data elements of the input array. If out=None, a new array containing the variance is returned; otherwise, a reference to the output array is returned.
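To illustrate the out behaviour described above, here is a small sketch in which the result is written into a preallocated buffer. The names m and buf are hypothetical and chosen only for this example.

```python
import numpy as np

m = np.array([[1.0, 2.0],
              [3.0, 5.0]])

# Preallocate an array with the shape of the expected result (one value per column)
buf = np.empty(2)

# The variances are written into buf instead of a newly allocated array
result = np.var(m, axis=0, out=buf)

print(buf)                             # column variances: 1.0 and 2.25
print(np.shares_memory(result, buf))   # the returned array refers to the same memory
```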
Example of Numpy Variance:
import numpy as np
# create array
array = np.arange(10)
print(array)
r = np.var(array)
print("\nvariance: ", r)
In the above example, the Numpy var() function is used to calculate the variance of an array created by the programmer. The optional parameters can be omitted when calling the function. The Numpy var() function returns the variance of the array that is passed to it.
Numpy Variance var() with desired dtype
import numpy as np
# 1D array
a = [20, 2, 7, 1, 34]
print("array : ", a)
print("var of array : ", np.var(a))
print("\nvar of array : ", np.var(a, dtype = np.float32))
print("\nvar of array : ", np.var(a, dtype = np.float64))
Output:
array : [20, 2, 7, 1, 34]
var of array : 158.16
var of array : 158.16
var of array : 158.16
In the above example, we first print the variance of the given 1D array without specifying dtype. dtype is the data type used while computing the variance. It is optional and, by default, is float64 for integer type arrays. When we include the dtype parameter and set it to a value other than the default, the variance is computed and returned in the desired dtype. Here we have set dtype to float32 and float64, respectively.
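The printed values look identical, so it can help to inspect the dtype of each result directly. The following sketch reuses the list a from the example; the float32 array f and the other variable names are assumptions added for illustration.

```python
import numpy as np

a = [20, 2, 7, 1, 34]

v_default = np.var(a)                # integer input: computed in float64
v32 = np.var(a, dtype=np.float32)    # computed and returned as float32
print(v_default.dtype, v32.dtype)    # float64 float32

# For a float32 array the default is the array's own type;
# passing dtype=np.float64 requests a higher-precision computation
f = np.array(a, dtype=np.float32)
v_f = np.var(f)
v_f64 = np.var(f, dtype=np.float64)
print(v_f.dtype, v_f64.dtype)        # float32 float64
```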
Numpy Variance function in Python for multi dimensional array
import numpy as np
# 2D array
arr = [[2, 2, 2, 2, 2],
       [15, 6, 27, 8, 2],
       [23, 2, 54, 1, 2],
       [11, 44, 34, 7, 2]]
# var of the flattened array
print("\nvar of arr, axis = None : ", np.var(arr))
# var along the axis = 0
print("\nvar of arr, axis = 0 : ", np.var(arr, axis = 0))
# var along the axis = 1
print("\nvar of arr, axis = 1 : ", np.var(arr, axis = 1))
Output:
var of arr, axis = None : 236.14000000000004
var of arr, axis = 0 : [ 57.1875 312.75 345.6875 9.25 0. ]
var of arr, axis = 1 : [ 0. 77.04 421.84 269.04]
In the above example, the variance of the given multidimensional array is calculated using different values for the axis parameter. When the axis is set to None, which is the default value, it calculates the variance of the flattened array. When the axis is set to 0, the variance is computed down the rows, giving one value per column (five values here). When the axis is set to 1, the variance is computed across the columns, giving one value per row (four values here).
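One way to keep the axis directions straight is to look at the shape of each result. The sketch below reuses the 2D array from the example; the helper names per_row, same_as_axis1 and same_as_flat are illustrative assumptions.

```python
import numpy as np

arr = np.array([[2, 2, 2, 2, 2],
                [15, 6, 27, 8, 2],
                [23, 2, 54, 1, 2],
                [11, 44, 34, 7, 2]])

print(arr.shape)                     # (4, 5)
print(np.var(arr, axis=0).shape)     # (5,)  one variance per column
print(np.var(arr, axis=1).shape)     # (4,)  one variance per row

# axis=1 is equivalent to taking the variance of each row separately
per_row = np.array([np.var(row) for row in arr])
same_as_axis1 = np.allclose(per_row, np.var(arr, axis=1))

# A tuple of axes reduces over all of them; for a 2D array,
# axis=(0, 1) matches the flattened result
same_as_flat = np.isclose(np.var(arr, axis=(0, 1)), np.var(arr))
print(same_as_axis1, same_as_flat)
```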
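Numpy var() v/s Statistics variance()
The variance() function of Python's statistics module calculates the variance of the given data, just like the Numpy var() function. Note that statistics.variance() computes the sample variance (divisor N - 1), which corresponds to np.var() with ddof=1, while statistics.pvariance() computes the population variance and matches the np.var() default. The statistics module does not work well with a multi-dimensional array because: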
Multidimensional arrays cannot be created using the statistics module. We need the Numpy library for that.
Also, there is no parameter to specify the axis along which the variance is to be calculated for multidimensional arrays.
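To make the missing axis parameter concrete, the sketch below computes row-wise variances both ways. It uses statistics.pvariance() (population variance) so that the numbers line up with the np.var() default; the nested list rows and the variable names are assumptions for this illustration.

```python
import statistics
import numpy as np

rows = [[2, 2, 2, 2, 2],
        [15, 6, 27, 8, 2],
        [23, 2, 54, 1, 2],
        [11, 44, 34, 7, 2]]

# The statistics module has no axis argument, so each row is handled in an explicit loop
per_row_stats = [statistics.pvariance(row) for row in rows]

# NumPy reduces along the chosen axis in a single call
per_row_numpy = np.var(rows, axis=1)

print(per_row_stats)
print(per_row_numpy)
```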
Syntax of Statistics variance():
statistics.variance(data, xbar=None)
Here, data is a sequence of valid numbers, which may include Decimal and Fraction values; this parameter is required. xbar is the mean of data; this parameter is optional. If it is not given, the mean is calculated automatically.
Example of Statistics variance()
import statistics
dataset = [21, 19, 11, 21, 19, 46, 29]
output = statistics.variance(dataset)
print(output)
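The two libraries use different default divisors, which is easy to miss when comparing results. The sketch below reuses the dataset from the example and shows which calls correspond to each other, along with the optional xbar argument; the variable names are illustrative assumptions.

```python
import statistics
import numpy as np

dataset = [21, 19, 11, 21, 19, 46, 29]

# statistics.variance() is the sample variance (divisor N - 1),
# which corresponds to np.var() with ddof=1
sample_stats = statistics.variance(dataset)
sample_numpy = np.var(dataset, ddof=1)

# statistics.pvariance() is the population variance (divisor N),
# which corresponds to the np.var() default
population_stats = statistics.pvariance(dataset)
population_numpy = np.var(dataset)

# Passing a precomputed mean through xbar gives the same sample variance
mean_value = statistics.mean(dataset)
with_xbar = statistics.variance(dataset, xbar=mean_value)

print(sample_stats, sample_numpy)
print(population_stats, population_numpy)
```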
In conclusion, this article covers the Numpy variance function in Python. The variance function is used to find the variance of a given data set. Importing the Numpy module lets you create an ndarray and perform operations such as mean, standard deviation, and variance on it using the functions built into the Numpy module itself. You can refer to the above examples for any queries regarding the Numpy var() function in Python.
However, if you have any doubts or questions, do let me know in the comment section below. I will try to help you as soon as possible.