Hello Programmers, we will discuss the numpy digitize() function in python available with the Numpy module in today’s article. Numpy library in Python contains a multidimensional matrix and array data structures. Thus making all the array related aspects simpler and easier with the use of various functions is available. As Numpy stands for Numerical Python, we can use it for various mathematical operations on arrays.
NumPy Digitize() is used to get the indices of bins to which each of these values belongs in the input array. In simpler words, this function returns the bins to which each of the array’s values belongs. This method is critical to segregate many arrays into a group of arrays according to their values. The best example of this would be segregating students to A, B, C grades according to their marks in the exam.
Before we start with this method and its ways of use, let me just brief you about what is Numpy digitize() method.
What is Numpy digitize()
Numpy digitize() function helps to get the indices of the bin to which each value of the input array belongs and returns an array containing the indices of the bin. Input array having the values and output array holding the indices of bins can be multidimensional. Bins are 1D and monotonic. np.digitize() is implemented as np.searchsorted. Means that a binary search is used to bin the values which scales better for larger number of bins. Also it removes the requirement for the input array to be 1-dimensional.
np.digitize(Array, Bin, Right)
Array means the input array or array to be binned. Bin is an array of bins. Right indicates whether the intervals include the right or left edge. The right edge not included is the default.
Array containing indices of the bins
Examples of Numpy Digitize Function
# import numpy import numpy as np a = np.array([1.2, 2.4, 3.6, 4.8]) bins = np.array([1.0, 1.3, 2.5, 4.0, 10.0]) # using np.digitize() method g = np.digitize(a, bins) print(g)
[1 2 3 4]
In the above example, np.digitize function returns an array holding the indices of all the values of bins.
Different ways of implementing Numpy digitize() function in Python are –
- Placing values in two bins
- All values in three bins
- Counting the frequency of the bins
Placing values in Two Bins using Numpy digitize()
import numpy as np #creating data d = [2, 4, 4, 7, 12, 14, 19, 20, 24, 31, 34] #placing values into bins np.digitize(d, bins=)
array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
The above example shows how to place the values of an array into two bins. the two bins are defined as- 1. 0 if x < 20 2. 1 if x >= 20. It thus returns an array with an index as 0 for all values below 20 and as 1 for those above 20.
All values placed in Three Bins
import numpy as np #create data d = [2, 4, 4, 7, 12, 14, 20, 22, 24, 31, 34] #place values into bins np.digitize(d, bins=[10, 20])
array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2])
All the data values in the above example are placed in three bins. Bins defined in this example is as follows: 1. 0 if x ≤ 10 2. 1 if 10 < x ≤ 20 3. 2 if x > 20 It returns an array with index as 0 for values under 10 and as 1 for values till 20 and 2 for values above 20. Since the syntax of np.digitize() excludes the right edge by default, the value 20 is index 2 in the given array. If we set Right = True, the index of 20 is returned as 1.
Frequency Count of Each Bin with numpy.digitize() function in Python
import numpy as np #create data d = [2, 4, 4, 7, 12, 14, 20, 22, 24, 31, 34] #place values into bins bin_data = np.digitize(d, bins=[10, 20]) #view binned data bin_d #count frequency of each bin np.bincount(bin_d)
array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2]) array([4, 2, 5])
numpy.bincount() is a very useful method that complements the numpy.digitize() function. It counts the frequency of each bin. In this example, first, we place all values into three bins and then count the frequency of each. The output is thus given as – Count 4 as Bin “0” contains 4 data values. Count 2 as Bin “1” contains 2 data values. Count 5 as Bin “2” contains 5 data values.
- How to Convert String to Lowercase in
- How to Calculate Square Root
- User Input | Input () Function | Keyboard Input
- Best Book to Learn Python in 2020
We saw different ways of placing variables into bins using the the numpy.digitize() function in python. Also saw the use of numpy.bincount() function and how it complements the digitize function. Numpy digitize() function raises an ValueError if the bins are not monotonic. And raises a TypeError if the type of the input array is complex. Hope this article helps you to the digitize() function as and when required.
Try to run the programs on your side and let us know if you have any queries.