Analyzing data is a significant part of any machine learning or deep learning project, and Python’s large collection of libraries for the task is one reason developers favor the language. One such library for analyzing datasets is pandas. Covering all of its functionality is beyond the scope of this article; here we focus on one of its attributes, iloc. So, let’s get started.
Pandas.DataFrame.iloc
Pandas is an open-source Python package widely used for data science, data analysis, and machine learning tasks. DataFrame is one of its classes and represents data in tabular form. iloc is a DataFrame attribute for accessing data by purely integer-location-based indexing, that is, selection by position, much like accessing list elements by their index. Its return type depends on what we pass: a single integer returns the row as a Series object, while a list of integers, a slice, or a boolean mask returns a DataFrame object. To understand it more clearly, let’s look at its syntax first and then at some examples.
Syntax:
dataframe_object.iloc[<position>]
- position: the integer position(s) we want to access. It can be a single integer, a list of integers, a slice, or a list of boolean values.
Let’s see some examples.
Example 1: Accessing one row using iloc
import pandas as pd
# Creating series objects for the dataframe
record1 = pd.Series({'Name':'Alice','Class':'Physics','Score':85})
record2 = pd.Series({'Name':'Jack','Class':'Chemistry','Score':87})
record3 = pd.Series({'Name':'Helen','Class':'Maths','Score':93})
# Converting series into dataframe
df = pd.DataFrame([record1,record2,record3],index = ['Stu1','Stu2','Stu3'])
print(df)
print()
# Accessing a row from dataframe
print(df.iloc[1])
print(type(df.iloc[1])) #Checking datatype of returned object
Output:
Name Class Score
Stu1 Alice Physics 85
Stu2 Jack Chemistry 87
Stu3 Helen Maths 93
Name Jack
Class Chemistry
Score 87
Name: Stu2, dtype: object
<class 'pandas.core.series.Series'>
In the above example, we passed the index position of the row we wanted to access (1, i.e., the second row), and iloc returned that row as a Series object.
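A few related cases are worth trying alongside Example 1. The sketch below rebuilds the same dataframe from a dictionary (a shortcut used only for this illustration) and shows a negative position, a one-element list, and an out-of-range position.

```python
import pandas as pd

# Same data as Example 1, built from a dictionary for brevity
df = pd.DataFrame(
    {
        "Name": ["Alice", "Jack", "Helen"],
        "Class": ["Physics", "Chemistry", "Maths"],
        "Score": [85, 87, 93],
    },
    index=["Stu1", "Stu2", "Stu3"],
)

last_row = df.iloc[-1]        # negative positions count from the end
one_row_frame = df.iloc[[1]]  # a one-element list keeps the result a DataFrame

print(type(last_row).__name__)       # Series
print(type(one_row_frame).__name__)  # DataFrame

try:
    df.iloc[5]  # there is no position 5 in a three-row dataframe
except IndexError as err:
    print("IndexError:", err)
```

So a single integer always gives a Series, and wrapping that integer in a list is a simple way to get a one-row DataFrame instead.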
Example 2: Accessing Multiple Rows using iloc
Sometimes we want to access more than one row of the dataframe. There are two approaches: passing a list of index positions, or slicing. Let’s see how we can do it.
import pandas as pd
# Creating series objects for the dataframe
record1 = pd.Series({'Name':'Alice','Class':'Physics','Score':85})
record2 = pd.Series({'Name':'Jack','Class':'Chemistry','Score':87})
record3 = pd.Series({'Name':'Helen','Class':'Maths','Score':93})
# Converting series into dataframe
df = pd.DataFrame([record1,record2,record3],index = ['Stu1','Stu2','Stu3'])
print(df)
print()
# Accessing multiple rows using a list of index positions and checking the datatype
print(df.iloc[[0,1]])
print(type(df.iloc[[0,1]]))
print()
# Accessing multiple rows using slicing
print(df.iloc[0:2])
print(type(df.iloc[0:2]))
Output:
Name Class Score
Stu1 Alice Physics 85
Stu2 Jack Chemistry 87
Stu3 Helen Maths 93
Name Class Score
Stu1 Alice Physics 85
Stu2 Jack Chemistry 87
<class 'pandas.core.frame.DataFrame'>
Name Class Score
Stu1 Alice Physics 85
Stu2 Jack Chemistry 87
<class 'pandas.core.frame.DataFrame'>
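Slicing with iloc follows the same rules as slicing a Python list, which is easy to overlook. The sketch below, using the same data rebuilt from a dictionary for brevity, shows that the end position is excluded, that steps and negative positions work, and that a slice running past the end is simply cut short.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Jack", "Helen"],
        "Class": ["Physics", "Chemistry", "Maths"],
        "Score": [85, 87, 93],
    },
    index=["Stu1", "Stu2", "Stu3"],
)

first_two = df.iloc[0:2]    # positions 0 and 1; position 2 is excluded
every_other = df.iloc[::2]  # positions 0 and 2
last_two = df.iloc[-2:]     # the final two rows
beyond_end = df.iloc[1:10]  # truncated to positions 1 and 2, no error raised

print(list(first_two.index))    # ['Stu1', 'Stu2']
print(list(every_other.index))  # ['Stu1', 'Stu3']
print(list(last_two.index))     # ['Stu2', 'Stu3']
print(list(beyond_end.index))   # ['Stu2', 'Stu3']
```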
Example 3: Accessing Rows using a Boolean Mask
The third way to access rows with the iloc attribute is a boolean mask. We pass a list with one boolean value per row: True for the rows we want to access and False for those we don’t. The list must have the same length as the axis being indexed.
import pandas as pd
# Creating series objects for the dataframe
record1 = pd.Series({'Name':'Alice','Class':'Physics','Score':85})
record2 = pd.Series({'Name':'Jack','Class':'Chemistry','Score':87})
record3 = pd.Series({'Name':'Helen','Class':'Maths','Score':93})
# Converting series into dataframe
df = pd.DataFrame([record1,record2,record3],index = ['Stu1','Stu2','Stu3'])
print(df)
print()
# Accessing multiple rows using a boolean mask (first and third rows)
print(df.iloc[[True, False, True]])
print(type(df.iloc[[True, False, True]]))
Output:
Name Class Score
Stu1 Alice Physics 85
Stu2 Jack Chemistry 87
Stu3 Helen Maths 93
Name Class Score
Stu1 Alice Physics 85
Stu3 Helen Maths 93
<class 'pandas.core.frame.DataFrame'>
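In practice, a mask is usually computed from the data rather than typed by hand. Since iloc is position-based, it expects a plain list or array of booleans rather than a labelled boolean Series, so the sketch below converts the condition to a NumPy array first. The threshold of 86 is an arbitrary value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Jack", "Helen"],
        "Class": ["Physics", "Chemistry", "Maths"],
        "Score": [85, 87, 93],
    },
    index=["Stu1", "Stu2", "Stu3"],
)

# Build the mask from a condition, then strip the index with to_numpy()
mask = (df["Score"] > 86).to_numpy()
high_scores = df.iloc[mask]

print(mask.tolist())  # [False, True, True]
print(high_scores)    # rows Stu2 and Stu3
```

When the mask is already a Series, loc is often the more natural choice, because it accepts the Series directly.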
Indexing Both Axes in Iloc
So far, we have indexed only one axis to get row values. However, we can also index both axes to get a particular value or a smaller dataframe. The first argument refers to the row position and the second to the column position. Both are zero-based, so to access the element in the second row and third column, we pass [1,2]: row position 1 and column position 2. Let’s see an example.
Example 4: Accessing a Single Element in a Dataframe
import pandas as pd
record1 = pd.Series({'i':'00','j':'01','k':'02','l':'03'})
record2 = pd.Series({'i':'10','j':'11','k':'12','l':'13'})
record3 = pd.Series({'i':'20','j':'21','k':'22','l':'23'})
record4 = pd.Series({'i':'30','j':'31','k':'32','l':'33'})
# Creating dataframe
df = pd.DataFrame([record1,record2,record3,record4],index = ['0','1','2','3'])
print(df)
# Accessing element from the second row and third column
print()
print("Accessing single element i.e.",df.iloc[1,2])
Output:
i j k l
0 00 01 02 03
1 10 11 12 13
2 20 21 22 23
3 30 31 32 33
Accessing single element i.e. 12
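Either argument can also be a slice, which gives whole columns or rectangular blocks. The sketch below rebuilds the dataframe of Example 4 from a nested list (a shortcut used only for this illustration) and shows three such selections.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["00", "01", "02", "03"],
        ["10", "11", "12", "13"],
        ["20", "21", "22", "23"],
        ["30", "31", "32", "33"],
    ],
    columns=["i", "j", "k", "l"],
    index=["0", "1", "2", "3"],
)

third_column = df.iloc[:, 2]  # every row, column position 2 -> Series
block = df.iloc[1:3, 0:2]     # row positions 1-2, column positions 0-1 -> DataFrame
last_cell = df.iloc[-1, -1]   # bottom-right value

print(third_column.tolist())  # ['02', '12', '22', '32']
print(block)
print(last_cell)              # 33
```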
Example 5: Assigning Values using iloc
import pandas as pd
# Creating series objects for the dataframe
record1 = pd.Series({'Name':'Alice','Class':'Physics','Score':85})
record2 = pd.Series({'Name':'Jack','Class':'Chemistry','Score':87})
record3 = pd.Series({'Name':'Helen','Class':'Maths','Score':93})
# Converting series into dataframe
df = pd.DataFrame([record1,record2,record3],index = ['Stu1','Stu2','Stu3'])
df.iloc[2,1]= 'Biology'
print(df)
Output:
Name Class Score
Stu1 Alice Physics 85
Stu2 Jack Chemistry 87
Stu3 Helen Biology 93
In the above example, we assign a value using the iloc attribute. We first select the cell at row position 2 and column position 1 (Helen’s Class) and then assign the new value to it, which changes 'Maths' to 'Biology'.
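Counting column positions by hand is error-prone, so one option is to look the position up from the column name. The sketch below does this with df.columns.get_loc and also assigns several cells at once; the new scores are made-up values for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Jack", "Helen"],
        "Class": ["Physics", "Chemistry", "Maths"],
        "Score": [85, 87, 93],
    },
    index=["Stu1", "Stu2", "Stu3"],
)

score_pos = df.columns.get_loc("Score")  # position of the 'Score' column (2)

df.iloc[0, score_pos] = 90         # assign a single cell
df.iloc[1:, score_pos] = [88, 95]  # assign the remaining two cells in one step

print(df)
```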
Example 8: Dropping a Row (Not Possible with iloc)
Sometimes we also need to drop a row from a pandas dataframe. It is worth remembering that iloc cannot do this: it only selects and assigns. To drop a row, we use the drop() method with the row’s index label, which returns a new dataframe without that row. Let’s see the example, using the dataframe from Example 1.
print(df.drop('Stu2'))
Output:
Name Class Score
Stu1 Alice Physics 85
Stu3 Helen Maths 93
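If we only know the position of the row, not its label, we can translate the position into a label through df.index, or simply select the rows we want to keep with iloc. The sketch below shows both options on the same data.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Jack", "Helen"],
        "Class": ["Physics", "Chemistry", "Maths"],
        "Score": [85, 87, 93],
    },
    index=["Stu1", "Stu2", "Stu3"],
)

# Option 1: turn position 1 into its label ('Stu2') and drop that label
without_second = df.drop(df.index[1])

# Option 2: keep only the positions we want
kept = df.iloc[[0, 2]]

print(without_second)
print(kept.equals(without_second))  # True
```

Both options leave the original df unchanged, since each returns a new dataframe.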
Example 6: Filtering Dataframe using iloc
Indexing both axes also lets us pick out the rows and columns we want and build a separate dataframe from them. We pass the list of desired row positions and the list of desired column positions separately. Let’s see the example.
import pandas as pd
record1 = pd.Series({'i':'00','j':'01','k':'02','l':'03'})
record2 = pd.Series({'i':'10','j':'11','k':'12','l':'13'})
record3 = pd.Series({'i':'20','j':'21','k':'22','l':'23'})
record4 = pd.Series({'i':'30','j':'31','k':'32','l':'33'})
# Creating dataframe
df = pd.DataFrame([record1,record2,record3,record4],index = ['0','1','2','3'])
print(df)
# Accessing 2nd and 4th row and 2nd and 3rd column
print()
print("Filtered dataframe")
print(df.iloc[[1,3],[1,2]])
Output:
i j k l
0 00 01 02 03
1 10 11 12 13
2 20 21 22 23
3 30 31 32 33
Filtered dataframe
j k
1 11 12
3 31 32
Example 7: Pandas Replacing Header with Top Row Using Iloc
We can also use the iloc attribute to replace the header of a dataframe with one of its rows. Let’s see how we can do it.
Sample Dataframe:
Unnamed: 1 Unnamed: 2 Unnamed: 3 Unnamed: 4
0 Sample Number Group Number Sample Name Group Name
1 1.0 1.0 s_1 g_1
2 2.0 1.0 s_2 g_1
3 3.0 1.0 s_3 g_1
4 4.0 2.0 s_4 g_2
In the above case, if we want to discard the current header and use the row at position 0 as the header instead, we can use the following lines of code.
new_header = df.iloc[0] # Accessing the row at position 0
df = df[1:] # Keeping the dataframe from the 1st row to the end
df.columns = new_header # Setting that row as the dataframe header
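For a version that can be run on its own, the sketch below builds a shortened copy of the sample dataframe by hand (only the first two data rows) and promotes its first row to the header. Converting the row to a plain list and resetting the index are optional tidy-up steps added here, not part of the original approach.

```python
import pandas as pd

# Hand-built stand-in for the sample dataframe (first two data rows only)
df = pd.DataFrame(
    [
        ["Sample Number", "Group Number", "Sample Name", "Group Name"],
        [1.0, 1.0, "s_1", "g_1"],
        [2.0, 1.0, "s_2", "g_1"],
    ],
    columns=["Unnamed: 1", "Unnamed: 2", "Unnamed: 3", "Unnamed: 4"],
)

new_header = df.iloc[0]           # the row at position 0, as a Series
df = df.iloc[1:].copy()           # everything after the header row
df.columns = new_header.tolist()  # a plain list avoids carrying over the Series name
df = df.reset_index(drop=True)    # optional: restart the row labels at 0

print(df)
```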
FAQs on Pandas ILOC
The difference between iloc and loc is that iloc accesses elements by index position, whereas loc accesses them by index label.
iloc in pandas selects rows (and columns) of a dataframe by position. NumPy arrays have no iloc attribute; there, we access elements by indexing the array object directly.
If we want to access elements by the label of the row’s index, we use the loc attribute; if we want to use only the position, regardless of the labels, we use the iloc attribute.
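The difference is easiest to see when the index labels are integers that do not match the positions. The sketch below uses a made-up dataframe with the labels 10, 20, and 30 to compare the two attributes, including how they treat the end of a slice.

```python
import pandas as pd

# Integer labels that deliberately differ from the positions 0, 1, 2
df = pd.DataFrame({"Score": [85, 87, 93]}, index=[10, 20, 30])

by_label = df.loc[10, "Score"]  # label 10, column name 'Score'
by_position = df.iloc[0, 0]     # row position 0, column position 0

label_slice = df.loc[10:20]     # label slices include the end label
position_slice = df.iloc[0:1]   # position slices exclude the end position

print(by_label, by_position)  # 85 85
print(len(label_slice))       # 2
print(len(position_slice))    # 1
```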
Conclusion
In this article, we learned about the dataframe’s iloc attribute: why we use it and the different ways we can use it to get the desired result. I hope this article has helped you. Thank you.