Numpy Digitize() Function With Examples in Python

Hello Programmers, in today's article we will discuss the digitize() function available in Python's NumPy module. The NumPy library provides multidimensional array and matrix data structures, along with a wide range of functions that make array-related tasks simpler. As NumPy stands for Numerical Python, we can use it for various mathematical operations on arrays.

Before we start with this function and the ways to use it, let me briefly explain what NumPy digitize() is.

What is Numpy digitize()

NumPy's digitize() function returns, for each value of the input array, the index of the bin to which that value belongs. The input array can be multidimensional, and the output array of bin indices has the same shape. The bins must be one-dimensional and monotonic (either increasing or decreasing). np.digitize() is implemented in terms of np.searchsorted(), which means that a binary search is used to bin the values. This scales better for a larger number of bins, and it also removes the requirement for the input array to be 1-dimensional.
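The relationship with np.searchsorted() can be checked with a small sketch. The array values here are made up for illustration, and the comparison assumes increasing bins with the default right=False, which corresponds to side="right" in np.searchsorted().

```python
import numpy as np

# Illustrative 2-D input and 1-D increasing bins (values are assumptions)
x = np.array([[0.5, 1.5],
              [2.5, 3.5]])
bins = np.array([1.0, 2.0, 3.0])

# Default digitize (right=False) on a 2-D array
idx_digitize = np.digitize(x, bins)

# Equivalent call for increasing bins: note the swapped argument order
idx_search = np.searchsorted(bins, x, side="right")

print(idx_digitize)                               # same shape as x
print(np.array_equal(idx_digitize, idx_search))   # True
```

The output keeps the 2-D shape of the input, and both calls give the same indices.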

Syntax:

np.digitize(x, bins, right=False)

Parameters

x is the input array to be binned. bins is a one-dimensional, monotonic array of bin edges. right indicates whether the intervals include the right or the left bin edge. The default is right=False, which means the right edge is not included: for increasing bins, index i is returned when bins[i-1] <= x < bins[i].

Return Type:

Array containing indices of the bins
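One detail worth knowing about the returned indices is what happens to values that fall outside the bin edges. The sketch below uses made-up values to show that such values get index 0 (below the first edge) or len(bins) (at or above the last edge, with the default right=False).

```python
import numpy as np

# Illustrative bin edges and data (values are assumptions)
bins = np.array([0.0, 1.0, 2.0])
x = np.array([-5.0, 0.5, 1.5, 99.0])

idx = np.digitize(x, bins)

# -5.0 is below bins[0]            -> 0
#  0.5 lies in [0.0, 1.0)          -> 1
#  1.5 lies in [1.0, 2.0)          -> 2
# 99.0 is beyond bins[-1]          -> len(bins) == 3
print(idx)
```

So an index of 0 or len(bins) signals a value outside the range covered by the bins.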

Examples of Numpy Digitize Function

Example:

# import numpy 
import numpy as np 
  
a = np.array([1.2, 2.4, 3.6, 4.8]) 
bins = np.array([1.0, 1.3, 2.5, 4.0, 10.0]) 
  
# using np.digitize() method 
g = np.digitize(a, bins) 
  
print(g) 

Output:

[1 2 3 4]

In the above example, the np.digitize() function returns an array holding, for each value of a, the index of the bin that the value falls into. For instance, 1.2 lies between 1.0 and 1.3, so its index is 1.

Different ways of using the NumPy digitize() function in Python are:
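To see which interval each returned index stands for, the indices can be used to look up the neighbouring bin edges. This is a minimal sketch that reuses the arrays from the example above; the lookup only works because every value lies inside the range of the bins, so no index is 0 or len(bins).

```python
import numpy as np

a = np.array([1.2, 2.4, 3.6, 4.8])
bins = np.array([1.0, 1.3, 2.5, 4.0, 10.0])

g = np.digitize(a, bins)

# For increasing bins and right=False: bins[g - 1] <= a < bins[g]
lower = bins[g - 1]
upper = bins[g]

for value, i, lo, hi in zip(a, g, lower, upper):
    print(f"{value} -> index {i}: {lo} <= {value} < {hi}")
```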

  • Placing values in two bins
  • All values in three bins
  • Counting the frequency of the bins

Placing values in Two Bins using Numpy digitize()

EXAMPLE:

import numpy as np

#creating data
d = [2, 4, 4, 7, 12, 14, 19, 20, 24, 31, 34]

#placing values into bins
np.digitize(d, bins=[20])

OUTPUT:

array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1])

EXPLANATION:

The above example shows how to place the values of an array into two bins. The two bins are defined as: bin 0 if x < 20, and bin 1 if x >= 20. It thus returns an array with index 0 for all values below 20 and index 1 for values of 20 and above.
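The bin indices can then be used to split or label the data. The following is only a sketch: the label strings are hypothetical names chosen for illustration, not something np.digitize() provides.

```python
import numpy as np

d = np.array([2, 4, 4, 7, 12, 14, 19, 20, 24, 31, 34])
idx = np.digitize(d, bins=[20])

# Split the data with boolean masks built from the bin indices
low = d[idx == 0]    # values below 20
high = d[idx == 1]   # values of 20 and above

# Hypothetical labels, looked up by bin index
labels = np.array(["below 20", "20 or more"])
named = labels[idx]

print(low)
print(high)
print(named[:3], named[-3:])
```

With a single bin edge, the result is the same as the comparison d >= 20 converted to integers.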

All values placed in Three Bins

EXAMPLE:

import numpy as np

#create data
d = [2, 4, 4, 7, 12, 14, 20, 22, 24, 31, 34]

#place values into bins
np.digitize(d, bins=[10, 20])

OUTPUT:

array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2])

EXPLANATION:

All the data values in the above example are placed in three bins. The bins in this example are defined as follows: bin 0 if x < 10, bin 1 if 10 <= x < 20, and bin 2 if x >= 20. It returns an array with index 0 for values under 10, index 1 for values from 10 up to (but not including) 20, and index 2 for values of 20 and above. Since np.digitize() excludes the right edge by default, the value 20 in the given array gets index 2. If we set right=True, the bins become x <= 10, 10 < x <= 20 and x > 20, and the index of 20 is returned as 1.
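The effect of the right parameter on the boundary value 20 can be checked directly. This sketch reuses the data from the example above and only changes the right argument.

```python
import numpy as np

d = [2, 4, 4, 7, 12, 14, 20, 22, 24, 31, 34]

# Default: right edge excluded, so 20 falls into bin 2
idx_default = np.digitize(d, bins=[10, 20])

# right=True: right edge included, so 20 falls into bin 1
idx_right = np.digitize(d, bins=[10, 20], right=True)

print(idx_default)  # [0 0 0 0 1 1 2 2 2 2 2]
print(idx_right)    # [0 0 0 0 1 1 1 2 2 2 2]
```

Only the value that sits exactly on a bin edge changes its index; all other values are binned the same way.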

Frequency Count of Each Bin with numpy.digitize() function in Python

EXAMPLE:

import numpy as np

#create data
d = [2, 4, 4, 7, 12, 14, 20, 22, 24, 31, 34]

#place values into bins
bin_d = np.digitize(d, bins=[10, 20])

#view binned data
bin_d

#count frequency of each bin
np.bincount(bin_d)

OUTPUT:

array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2])
array([4, 2, 5])

EXPLANATION:

numpy.bincount() is a very useful function that complements numpy.digitize(). It counts the frequency of each bin index. In this example, we first place all values into three bins and then count how many values fall into each. The output reads as follows: bin 0 contains 4 data values, bin 1 contains 2 data values, and bin 2 contains 5 data values.
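One thing to watch for is that np.bincount() only counts up to the largest index that actually occurs, so trailing empty bins are missing from the result. A minimal sketch, using made-up data in which no value reaches the last bin, shows how the minlength argument keeps one count per bin.

```python
import numpy as np

# Illustrative data (an assumption): no value is 20 or above
d = [2, 4, 12]
bins = [10, 20]

bin_d = np.digitize(d, bins=bins)

# Without minlength the empty last bin is dropped
counts_short = np.bincount(bin_d)

# len(bins) + 1 possible indices: 0, 1, ..., len(bins)
counts_full = np.bincount(bin_d, minlength=len(bins) + 1)

print(counts_short)  # [2 1]
print(counts_full)   # [2 1 0]
```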

Conclusion

We saw different ways of placing values into bins using the numpy.digitize() function in Python. We also saw the numpy.bincount() function and how it complements digitize(). Note that numpy.digitize() raises a ValueError if the bins are not monotonic, and a TypeError if the type of the input array is complex. We hope this article helps you use the digitize() function as and when required.
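The two error cases mentioned above can be reproduced with a short sketch. The input values are made up for illustration, and the exact wording of the error messages may differ between NumPy versions.

```python
import numpy as np

errors = []

# Bins that go down and then up are not monotonic -> ValueError
try:
    np.digitize([1, 2, 3], bins=[10, 5, 20])
except ValueError as exc:
    errors.append("ValueError")
    print("ValueError:", exc)

# Complex input values are not supported -> TypeError
try:
    np.digitize(np.array([1 + 2j]), bins=[0, 1])
except TypeError as exc:
    errors.append("TypeError")
    print("TypeError:", exc)
```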

Try to run the programs on your side and let us know if you have any queries.

Happy Coding!
