Numpy Digitize() Function With Examples in Python

Hello Programmers, in today’s article, we will discuss the numpy digitize() function in python available with the Numpy module. The Numpy library in Python contains a multidimensional matrix and array data structures. Thus, making all the array-related aspects simpler and easier with the use of various functions is available. As Numpy stands for Numerical Python, we can use it for various mathematical operations on arrays.

NumPy Digitize() is used to get the indices of bins to which each of these values belongs in the input array. In simpler words, this function returns the bins to which each of the array’s values belongs. This method is critical to segregate many arrays into a group of arrays according to their values. The best example of this would be segregating students into A, B, and C grades according to their marks in the exam.

Before we start with this method and its ways of use, let me just brief you about the Numpy digitize() method.

Contents

What is Numpy digitize()

The Numpy digitize() function helps to get the indices of the bin to which each value of the input array belongs and returns an array containing the indices of the bin. Input array having the values and output array holding the indices of bins can be multi-dimensional.

Bins are 1D and monotonic. np.digitize() is implemented as np.searchsorted. This means that a binary search is used to bin the values, which scales better for a larger number of bins. Also, it removes the requirement for the input array to be 1-dimensional.

Syntax:

np.digitize(Array, Bin, Right)

Parameters

Array means the input array or array to be binned. Bin is an array of bins. Right indicates whether the intervals include the right or left edge. The right edge not included is the default.

Return Type:

An array containing indices of the bins

Examples of Numpy Digitize Function

Example:

# import numpy 
import numpy as np 
  
a = np.array([1.2, 2.4, 3.6, 4.8]) 
bins = np.array([1.0, 1.3, 2.5, 4.0, 10.0]) 
  
# using np.digitize() method 
g = np.digitize(a, bins) 
  
print(g)

Output:

[1 2 3 4]

In the above example, np.digitize function returns an array holding the indices of all the values of bins.

Different ways of implementing Numpy digitize() function in Python are –

Placing values in two bins
All values in three bins
Counting the frequency of the bins

Placing values in Two Bins using Numpy digitize()

EXAMPLE:

import numpy as np

#creating data
d = [2, 4, 4, 7, 12, 14, 19, 20, 24, 31, 34]

#placing values into bins
np.digitize(d, bins=[20])

OUTPUT:

array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1])

EXPLANATION:

The above example shows how to place the values of an array into two bins. the two bins are defined as- 1. 0 if x < 20 2. 1 if x >= 20. It thus returns an array with an index of 0 for all values below 20 and 1 for those above 20.

All values were placed in Three Bins

EXAMPLE:

import numpy as np

#create data
d = [2, 4, 4, 7, 12, 14, 20, 22, 24, 31, 34]

#place values into bins
np.digitize(d, bins=[10, 20])

OUTPUT:

array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2])

EXPLANATION:

All the data values in the above example are placed in three bins. Bins defined in this example are as follows: 1. 0 if x ≤ 10 2. 1 if 10 < x ≤ 20 3. 2 if x > 20, It returns an array with an index of 0 for values under 10 and as 1 for values till 20 and 2 for values above 20. Since the syntax of np.digitize() excludes the right edge by default, the value 20 is index 2 in the given array. If we set Right = True, the index of 20 is returned as 1.

Frequency Count of Each Bin with numpy.digitize() function in Python

EXAMPLE:

import numpy as np

#create data
d = [2, 4, 4, 7, 12, 14, 20, 22, 24, 31, 34]

#place values into bins
bin_data = np.digitize(d, bins=[10, 20])

#view binned data
bin_d

#count frequency of each bin
np.bincount(bin_d)

OUTPUT:

array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2])
array([4, 2, 5])

EXPLANATION:

numpy.bincount() is a very useful method that complements the numpy.digitize() function. It counts the frequency of each bin. In this example, first, we place all values into three bins and then count the frequency of each. The output is thus given as – Count 4 as Bin “0” contains 4 data values. Count 2 as Bin “1” contains 2 data values. Count 5 as Bin “2” contains 5 data values.

Must Read:

Conclusion

We saw different ways of placing variables into bins using the numpy.digitize() function in Python. We also saw the use of numpy.bincount() function and how it complements the digitize function. The Numpy digitize() function raises a ValueError if the bins are not monotonic. It raises a TypeError if the type of the input array is complex. I hope this article helps you to the digitize() function as and when required.

Try to run the programs on your side, and let us know if you have any queries.

Happy Coding!