How to Convert Numpy Array to Pandas Dataframe

Introduction

In Python, there are several ways to convert a numpy array to a pandas dataframe, and sometimes a task calls for one particular method. In this tutorial, we will walk through the common ways to convert a numpy array to a pandas dataframe.

What is a Numpy Array?

A numpy array is a grid of values, all of the same type, indexed by a tuple of non-negative integers.

import numpy as np
arr = np.array((1, 2, 3, 4, 5))
print(arr)

Output:

[1 2 3 4 5]
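To make the definition more concrete, here is a small sketch of our own (the variable names grid and mixed are just illustrative). It shows the shape of a two-dimensional array, indexing by a tuple, and how numpy settles on a single type for all values.

```python
import numpy as np

# A 2-D array: every element shares one dtype.
grid = np.array([[1, 2, 3], [4, 5, 6]])

print(grid.shape)   # (2, 3): two rows, three columns
print(grid.ndim)    # 2: number of dimensions
print(grid[1, 2])   # 6: indexed by the tuple (row 1, column 2)

# Mixing integers and floats upcasts the whole array to one float dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64
```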

What is Pandas Dataframe?

A pandas dataframe is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled rows and columns.

import pandas as pd
lst = ['Latracal', 'Solution', 'an', 'online', 
            'portal', 'for', 'languages']
df = pd.DataFrame(lst)
print(df)

Output:

       0
0   Latracal
1   Solution
2         an
3     online
4     portal
5        for
6  languages
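The words "heterogeneous" and "size-mutable" are easier to see in code. The following is a minimal sketch with made-up sample data (the names and numbers are ours, not from any real dataset): each column keeps its own type, and a new column can be added after the dataframe is created.

```python
import pandas as pd

# Illustrative data only: one text column, one integer column, one float column.
people = pd.DataFrame({
    "name": ["Ada", "Linus", "Grace"],
    "age": [36, 54, 45],
    "score": [9.5, 8.0, 9.9],
})
print(people.dtypes)  # each column reports its own dtype

# Size-mutable: adding a column changes the shape in place.
people["passed"] = people["score"] > 9.0
print(people.shape)   # (3, 4)
```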

Syntax of Pandas Dataframe

pandas.DataFrame(data=None, index=None, columns=None)

Parameters of Pandas Dataframe

  • data: The input data, such as a numpy array or a dictionary.
  • index: The row labels for the resulting dataframe.
  • columns: The column labels for the resulting dataframe.
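As a quick illustration of these three parameters, the sketch below builds the same small table twice, once from a numpy array and once from a dictionary. The labels r1, r2, x, and y are placeholders we chose for the example.

```python
import numpy as np
import pandas as pd

values = np.array([[10, 20], [30, 40]])

# data as a numpy array, with explicit row (index) and column labels
from_array = pd.DataFrame(data=values, index=["r1", "r2"], columns=["x", "y"])

# data as a dictionary: the keys become the column labels
from_dict = pd.DataFrame(data={"x": [10, 30], "y": [20, 40]}, index=["r1", "r2"])

print(from_array)
print(from_dict)
```

Both calls should print the same table; with a dictionary, the columns argument is not needed because the keys already name the columns.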

Steps to Convert Numpy Array to Pandas Dataframe

  1. Import the modules pandas and numpy.
  2. Create the numpy array.
  3. Create the list of index values and column values for the DataFrame.
  4. Then, create the dataframe.
  5. At last, display the dataframe.
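The five steps above can be sketched as follows. The array values and the labels (a, b, c, height, width) are placeholders chosen for illustration.

```python
# Step 1: import the modules
import numpy as np
import pandas as pd

# Step 2: create the numpy array
arr = np.array([[1.5, 2.5], [3.5, 4.5], [5.5, 6.5]])

# Step 3: create the lists of index (row) and column labels
index = ["a", "b", "c"]
columns = ["height", "width"]

# Step 4: create the dataframe
df = pd.DataFrame(arr, index=index, columns=columns)

# Step 5: display the dataframe
print(df)
```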

Various Examples to Convert Numpy Array to Pandas Dataframe

Let us look at the conversion of a numpy array to a pandas dataframe through several methods, each explained with an example:

1. Using Numpy array from random.rand method to Convert Numpy array to Pandas Dataframe

In this example, we will generate the numpy array with the random.rand() function from numpy and then apply the dataframe syntax to convert it to a pandas dataframe.

#import numpy and pandas module
import numpy as np 
import pandas as pd 
  
arr = np.random.rand(4, 4) 
print("Numpy array : ",arr ) 
  
# conversion into dataframe 
df = pd.DataFrame(arr, columns =['A', 'B', 'C', 'D']) 
print("\nPandas DataFrame: ")
print(df)

Output:

Numpy array :  [[0.93845309 0.89059495 0.51480681 0.06583541]
 [0.94972596 0.55147651 0.40720578 0.86422873]
 [0.53556404 0.7760867  0.80657461 0.37336038]
 [0.21177783 0.90187237 0.53926327 0.06067915]]
Pandas DataFrame:
       A         B         C         D
0  0.938453  0.890595  0.514807  0.065835
1  0.949726  0.551477  0.407206  0.864229
2  0.535564  0.776087  0.806575  0.373360
3  0.211778  0.901872  0.539263  0.060679

Explanation:

First, we import the two modules, numpy and pandas. Second, we create the input array with the random.rand() function from the numpy module and print it; because the values are random, your numbers will differ from the output shown here. Third, we pass the array to pd.DataFrame() and set the column labels to A through D. Because we do not set the row labels, they default to integers starting from 0; column labels that are left unset default in the same way. Finally, we print the dataframe, and the output shows the array converted to a dataframe.
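To see the default labels in action, here is a minimal sketch that omits both index and columns. Note one deliberate swap: instead of random.rand() we use a seeded generator from np.random.default_rng(), an assumption on our part so that reruns produce the same numbers.

```python
import numpy as np
import pandas as pd

# Seeded generator (our choice, not part of the example above) for repeatable values.
rng = np.random.default_rng(seed=0)
arr = rng.random((4, 4))

# No index or columns given, so both fall back to 0, 1, 2, ...
df_default = pd.DataFrame(arr)

print(df_default.columns.tolist())  # [0, 1, 2, 3]
print(df_default.index.tolist())    # [0, 1, 2, 3]
```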

2. Using numpy array with random.rand and reshape()

In this example, we will create the array with random.rand() and reshape it with reshape(). We will then apply the dataframe syntax with column labels and print the resulting dataframe.

#import module: numpy and pandas
import numpy as np 
import pandas as pd 
  
arr = np.random.rand(6).reshape(2, 3) 
print("Numpy array : " ,arr) 
  
# converting into dataframe 
df = pd.DataFrame(arr, columns =['1', '2', '3']) 
print("\nPandas DataFrame: ") 
print(df)

Output:

Numpy array :  [[0.05949315 0.66499294 0.39795918]
 [0.93026286 0.42710097 0.70753262]]
Pandas DataFrame: 
      1         2         3
0  0.059493  0.664993  0.397959
1  0.930263  0.427101  0.707533

Explanation:

First, we import the two modules, numpy and pandas. Second, we create six random values with random.rand(), reshape them into 2 rows and 3 columns with reshape(), and print the input array. Third, we pass the array to pd.DataFrame() and set the column labels to 1 through 3, one label per column.

Because we do not set the row labels, they default to integers starting from 0. Finally, we print the dataframe, and the output shows the array converted to a dataframe.
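The number of column labels has to match the number of columns in the reshaped array. The sketch below, which uses np.arange() instead of random values so the numbers are predictable, shows what happens when there is one label too many and one way to avoid the mismatch by deriving the labels from arr.shape.

```python
import numpy as np
import pandas as pd

arr = np.arange(6).reshape(2, 3)  # 2 rows, 3 columns

# Four labels for three columns: pandas rejects this with a ValueError.
error_message = None
try:
    pd.DataFrame(arr, columns=["1", "2", "3", "4"])
except ValueError as exc:
    error_message = str(exc)
    print("Conversion failed:", error_message)

# Building the labels from arr.shape[1] keeps them in step with the array.
labels = [str(i + 1) for i in range(arr.shape[1])]
df = pd.DataFrame(arr, columns=labels)
print(df)
```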

3. Using np.array() to Convert Numpy array to Pandas Dataframe

In this example, we will create the input with np.array() and then convert the numpy array to a pandas dataframe through the dataframe syntax, setting both the row and column labels.

#import module numpy and pandas
import numpy as np 
import pandas as pd   
  
arr = np.array([[1, 2], [3, 4]]) 
print("Numpy array : ",arr) 
  
# converting into dataframe 
df = pd.DataFrame(data = arr, index =["row1", "row2"],  
                  columns =["col1", "col2"]) 
  
print("\nPandas DataFrame: ") 
print(df)

Output:

Numpy array :  [[1 2]
 [3 4]]
Pandas DataFrame: 
       col1  col2
row1     1     2
row2     3     4

Explanation:

First, we import the two modules, numpy and pandas. Second, we create the input array with the np.array() function from the numpy module and print it.

Third, we pass the array to pd.DataFrame() and set the row labels to row1 and row2 and the column labels to col1 and col2. If we did not set the row and column labels, they would default to integers starting from 0. Finally, we print the dataframe, and the output shows the array converted to a dataframe.
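Once the rows and columns are labeled, values can be looked up by label as well as by position, and the dataframe can be turned back into a numpy array. This is a short sketch reusing the same array and labels as the example above.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(arr, index=["row1", "row2"], columns=["col1", "col2"])

print(df.loc["row2", "col1"])  # label-based lookup -> 3
print(df.iloc[1, 0])           # position-based lookup of the same cell -> 3

# Round trip: to_numpy() returns the values as a numpy array again.
back = df.to_numpy()
print(np.array_equal(arr, back))  # True
```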

4. Creating an empty dataframe

In this example, we will show how to create a dataframe that holds no data yet, with every value set to NaN, and then print it.

#import pandas module and numpy module
import pandas as pd
import numpy as np

df = pd.DataFrame(np.nan, index=[0,1,2], columns=['A'])
print(df)

Output:

   A
0 NaN
1 NaN
2 NaN

Explanation:

First, we import the two modules, numpy and pandas. Second, we apply the dataframe syntax without creating an input array from the numpy module. In place of the data we pass np.nan, which fills every cell with NaN (Not a Number), the marker for a missing value; note that NaN is not the same as 0. The row labels are set to 0, 1, 2, and the single column is labeled A. If we did not set the row and column labels, they would default to integers starting from 0. Finally, we print the dataframe, and the output shows a dataframe filled with NaN.
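Since NaN and 0 are easy to confuse, here is a small sketch that checks the difference. It also contrasts the NaN-filled dataframe with a dataframe that has no rows or columns at all, which is what pandas itself reports as empty.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.nan, index=[0, 1, 2], columns=["A"])

print(df.isna().all().all())  # True: every cell is missing
print((df == 0).any().any())  # False: NaN does not compare equal to 0
print(df.empty)               # False: the frame still has 3 rows and 1 column

# Replace the missing values with real zeros if that is what we need.
zeros = df.fillna(0)
print(zeros)

# A dataframe with no rows and no columns is empty in the pandas sense.
truly_empty = pd.DataFrame()
print(truly_empty.empty)      # True
```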

5. Generating rows and columns through iteration

In this example, we will generate the row labels and column headers through iteration.

#import module: numpy and pandas
import pandas as pd 
import numpy as np 
  
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# build the row and column labels from the shape of the array
df = pd.DataFrame(data = arr,
                  index = ['Row_' + str(i + 1)
                           for i in range(arr.shape[0])],
                  columns = ['Column_' + str(i + 1)
                             for i in range(arr.shape[1])])
  
print(df) 

Output:

       Column_1  Column_2  Column_3
Row_1         1         2         3
Row_2         4         5         6

Explanation:

First, we import the two modules, numpy and pandas. Second, we create the input array with the np.array() function from the numpy module. Third, we pass the array to pd.DataFrame() and generate the row and column labels inside the call, using a for loop in a list comprehension over arr.shape[0] (the number of rows) and arr.shape[1] (the number of columns).

If we did not set the row and column labels, they would default to integers starting from 0. Finally, we print the dataframe, and the output shows the array converted to a dataframe.
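The same labels can also be written with f-strings, which some readers find easier to scan than string concatenation. This is only an alternative spelling of the example above, not a different method; n_rows and n_cols are names we introduce for readability.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Unpack the shape once so the label counts always match the array.
n_rows, n_cols = arr.shape

df = pd.DataFrame(
    arr,
    index=[f"Row_{i}" for i in range(1, n_rows + 1)],
    columns=[f"Column_{i}" for i in range(1, n_cols + 1)],
)
print(df)
```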

6. Generating Rows And Columns before converting into a DataFrame

In this example, we will create the input as a numpy array. Then, we will build the row labels and column headers separately, and after that, we will pass them to the dataframe syntax.

#import module: numpy and pandas
import pandas as pd 
import numpy as np 
  
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# one label per row
index = ['Row_' + str(i)
         for i in range(1, len(arr) + 1)]

# one label per column
columns = ['Column_' + str(i)
           for i in range(1, len(arr[0]) + 1)]

df = pd.DataFrame(arr,
                  index = index,
                  columns = columns)
 
print(df) 

Output:

       Column_1  Column_2  Column_3
Row_1         1         2         3
Row_2         4         5         6

Explanation:

First, we import the two modules, numpy and pandas. Second, we create the input array with the np.array() function from the numpy module. Third, we build the row and column labels and store them in the variables index and columns, using a for loop in a list comprehension.

Fourth, we pass the array to pd.DataFrame() together with the index and columns variables defined beforehand. If we did not set the row and column labels, they would default to integers starting from 0. Finally, we print the dataframe, and the output shows the array converted to a dataframe.
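If the same labeling is needed in several places, the steps can be wrapped in a small function. The helper below, array_to_frame, is a hypothetical name of our own and not part of pandas or numpy; it is a sketch that assumes a two-dimensional input.

```python
import numpy as np
import pandas as pd


def array_to_frame(arr, row_prefix="Row_", col_prefix="Column_"):
    """Hypothetical helper: label a 2-D array with numbered rows and columns."""
    arr = np.asarray(arr)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {arr.ndim} dimension(s)")
    n_rows, n_cols = arr.shape
    index = [f"{row_prefix}{i}" for i in range(1, n_rows + 1)]
    columns = [f"{col_prefix}{i}" for i in range(1, n_cols + 1)]
    return pd.DataFrame(arr, index=index, columns=columns)


frame = array_to_frame(np.array([[1, 2, 3], [4, 5, 6]]))
print(frame)

# The prefixes are optional and can be changed per call.
custom = array_to_frame(np.zeros((2, 2)), row_prefix="r", col_prefix="c")
print(custom)
```

Checking the number of dimensions up front gives a clearer error than the one raised when a one-dimensional array is unpacked into rows and columns.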

Conclusion

In this tutorial, we discussed how to create a pandas dataframe from a numpy array and worked through several programs that perform the conversion, with and without custom row and column labels. Each example is explained step by step, and you can use whichever program fits your requirements.
