In this article, we will see how to iterate over files in a directory, which simply means looping through them one by one. Python offers five different ways to do this, and each example below uses a for loop, one of the most convenient ways to iterate. Let us see how to iterate over files in a directory using Python.
Python provides five different methods to iterate over files in a directory: os.listdir(), os.scandir(), the pathlib module, os.walk(), and the glob module. A directory, also known as a folder, is a collection of files and subdirectories. The os module is useful for working with directories, and we can do a lot of work with it.
What is the os module in Python?
os is a module in Python's standard library, so it is available without installing anything. It is useful for working with directories: changing the current working directory, getting the current working directory, iterating over files, and so on. More generally, it provides functions for interacting with the operating system.
5 Ways in Python to loop through files in a directory
- os.listdir()
- os.scandir()
- pathlib module
- os.walk()
- glob module
1. Using os.listdir() in Python to loop through files in a directory
What is os.listdir()?
To get a list of all the files and directories in a specified directory, we use os.listdir(). When no directory is specified, it returns the list of files and directories in the current working directory.
Syntax
os.listdir(path)
Parameter
path of the directory, optional
Returns
a list of the names of all files and directories in the specified path.
1.1. Code to get the list of files in a specified path
import os

# Raw string so the backslash in the Windows path is not read as an escape.
path_of_the_directory = r'E:\Python for Data Science'
print("Files and directories in a specified path:")
for filename in os.listdir(path_of_the_directory):
    f = os.path.join(path_of_the_directory, filename)
    if os.path.isfile(f):  # keep regular files, skip subdirectories
        print(f)
First, we import the os module, which provides the functions for working with directories, and store the path of the directory in a variable. The for loop visits every name that os.listdir() returns, and os.path.join() turns each name into a full path. The if statement uses os.path.isfile() to keep only regular files, so subdirectories are skipped. If the directory itself does not exist, os.listdir() raises a FileNotFoundError.
Output
Files and directories in a specified path:
E:\Python for Data Science\FAQs.pdf
E:\Python for Data Science\Lec-1.pdf
E:\Python for Data Science\Lec-2.pdf
E:\Python for Data Science\Lec-3.pdf
E:\Python for Data Science\Lec-4.pdf
E:\Python for Data Science\Lec-5.pdf
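The same pattern can be tried without an E: drive. The following is a minimal sketch that builds a temporary directory, so it should run on any machine; the helper name list_files and the sample file names are assumptions made for illustration, not part of the original example.

```python
import os
import tempfile


def list_files(directory):
    """Return the sorted full paths of regular files directly inside directory.

    list_files is a hypothetical helper written for this sketch.
    """
    paths = []
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path):  # skip subdirectories
            paths.append(full_path)
    return sorted(paths)


# Build a throwaway directory with two files and one subdirectory.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "Lec-1.pdf"), "w").close()
    open(os.path.join(tmp, "Lec-2.pdf"), "w").close()
    os.mkdir(os.path.join(tmp, "notes"))
    found = [os.path.basename(p) for p in list_files(tmp)]
    print(found)  # ['Lec-1.pdf', 'Lec-2.pdf']

# The temporary directory is gone now, so listing it raises FileNotFoundError.
missing_dir_error = False
try:
    os.listdir(tmp)
except FileNotFoundError:
    missing_dir_error = True
print(missing_dir_error)  # True
```

The subdirectory "notes" is returned by os.listdir() but filtered out by os.path.isfile(), which is why it does not appear in the result.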
1.2. Code to get the list of files in a current working directory
import os
directory_list = os.listdir()
print("Files and directories in current working directory :")
print(directory_list)
As noted above, when no path is given, os.listdir() lists the current working directory. Here we import the os module, call os.listdir() without an argument, and print the resulting list of file and directory names.
Output
Files and directories in current working directory :
['binomial coefficeint.py', 'DLLs', 'Doc', 'file directories.py', 'generate color.py', 'include', 'is_integer.py', 'Lib', 'libs', 'LICENSE.txt', 'matplotlib.py', 'nan.py', 'NEWS.txt', 'python.exe', 'python3.dll', 'python39.dll', 'pythonw.exe', 'script.py', 'Scripts', 'stringbuilder.py', 'tcl', 'Tools', 'vcruntime140.dll', 'vcruntime140_1.dll', '__pycache__']
1.3. Iterate over files with certain extension using os.listdir()
import os

path_of_the_directory = r'E:\drivers'
ext = ('.pdf', '.exe')
for filename in os.listdir(path_of_the_directory):
    # str.endswith() accepts a tuple and matches any of its suffixes.
    if filename.endswith(ext):
        print(filename)
We import the os module and give the path of the directory. This time we only want files with certain extensions, which are stored in the tuple ext. The for loop iterates over the names in the directory, and the if statement checks whether each name ends with one of the extensions. Files with a matching extension are displayed; all others are ignored.
Output
Python.pdf
C programming.pdf
Java.pdf
LearnEngineering.in.pdf
DriverEasy_Setup.exe
driver_booster_setup.exe
python-3.9.6-amd64.exe
sp58516.exe
sp59647.exe
sp63302.exe
sp64031.exe
sp64949.exe
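One thing to keep in mind is that endswith() is case-sensitive, so a file named setup.EXE would not match '.exe'. The sketch below shows one possible way to make the comparison case-insensitive; the helper name files_with_extensions and the sample file names are assumptions for illustration.

```python
import os
import tempfile


def files_with_extensions(directory, extensions):
    """Return sorted names in directory that end with any of the extensions.

    Hypothetical helper: compares in lower case, so '.EXE' matches '.exe'.
    It checks names only and does not test whether an entry is a file.
    """
    wanted = tuple(ext.lower() for ext in extensions)
    return sorted(
        name for name in os.listdir(directory)
        if name.lower().endswith(wanted)
    )


with tempfile.TemporaryDirectory() as tmp:
    for name in ("Python.pdf", "setup.EXE", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    matches = files_with_extensions(tmp, (".pdf", ".exe"))
    print(matches)  # ['Python.pdf', 'setup.EXE']
```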
2. Using os.scandir() in Python to loop through files in a directory
What is os.scandir()?
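os.scandir() returns an iterator of os.DirEntry objects, one for each entry in the directory. Each DirEntry exposes the entry's name and path along with methods such as is_file() and is_dir(). The path is optional; when it is omitted, the current working directory ('.') is scanned.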
Syntax
os.scandir(path='.')
Parameter
path: the path of the directory, optional (defaults to '.', the current working directory)
Return
iterator of os.DirEntry objects.
2.1. Code
import os

path_of_the_directory = r'E:\Python for Data Science'
print("Files and Directories in '%s':" % path_of_the_directory)
# The with statement closes the scandir iterator automatically.
with os.scandir(path_of_the_directory) as entries:
    for entry in entries:
        if entry.is_dir() or entry.is_file():
            print(entry.name)
We import the os module and give the path of the directory. os.scandir() returns an iterator of DirEntry objects for that path, and the for loop visits each entry in turn. The if statement checks whether the entry is a directory or a regular file and, if so, prints its name. Once the loop finishes, the iterator is closed to release its resources. If the directory does not exist, os.scandir() raises a FileNotFoundError.
Output
Files and Directories in 'E:\Python for Data Science':
FAQs.pdf
Lec-1.pdf
Lec-2.pdf
Lec-3.pdf
Lec-4.pdf
Lec-5.pdf
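To see what a DirEntry offers beyond its name, here is a small sketch that labels each entry as a file or a directory. The helper name describe_entries and the sample entries are assumptions for illustration; the example uses a temporary directory so it can run anywhere.

```python
import os
import tempfile


def describe_entries(directory):
    """Return sorted (name, kind) pairs for the entries in directory.

    describe_entries is a hypothetical helper written for this sketch.
    """
    rows = []
    # The with statement closes the scandir iterator when the block ends.
    with os.scandir(directory) as entries:
        for entry in entries:
            kind = "dir" if entry.is_dir() else "file"
            rows.append((entry.name, kind))
    return sorted(rows)


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "sub"))
    rows = describe_entries(tmp)
    print(rows)  # [('a.txt', 'file'), ('sub', 'dir')]
```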
2.2. Iterate over files with certain extension using os.scandir()
import os

path_of_the_directory = r'E:\drivers'
ext = ('.exe', '.pdf')
for entry in os.scandir(path_of_the_directory):
    if entry.path.endswith(ext):
        print(entry)  # prints the DirEntry object itself, not just its name
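We import the os module and give the path of the directory. This time we only want files with certain extensions. The for loop iterates over the DirEntry objects in the directory, and the if statement checks whether each entry's path ends with one of the extensions. Entries with a matching extension are displayed; all others are ignored. Because the DirEntry object itself is printed, the output shows its representation rather than the bare file name.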
Output
<DirEntry 'Python.pdf'>
<DirEntry 'C Programming.pdf'>
<DirEntry 'Java.pdf'>
<DirEntry 'LearnEngineering.in.pdf'>
<DirEntry 'DriverEasy_Setup.exe'>
<DirEntry 'driver_booster_setup.exe'>
<DirEntry 'python-3.9.6-amd64.exe'>
<DirEntry 'sp58516.exe'>
<DirEntry 'sp59647.exe'>
<DirEntry 'sp63302.exe'>
<DirEntry 'sp64031.exe'>
<DirEntry 'sp64949.exe'>
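A DirEntry can also report file metadata through its stat() method. As a rough sketch, the function below filters by extension and records each file's size in bytes; the name sizes_by_extension and the sample files are assumptions for illustration.

```python
import os
import tempfile


def sizes_by_extension(directory, extensions):
    """Map file name to size in bytes for files ending with the extensions.

    sizes_by_extension is a hypothetical helper written for this sketch.
    """
    sizes = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(extensions):
                sizes[entry.name] = entry.stat().st_size
    return sizes


with tempfile.TemporaryDirectory() as tmp:
    # Write in binary mode so the byte counts are exact on every platform.
    with open(os.path.join(tmp, "a.pdf"), "wb") as fh:
        fh.write(b"abc")
    with open(os.path.join(tmp, "b.exe"), "wb") as fh:
        fh.write(b"hello")
    with open(os.path.join(tmp, "c.txt"), "wb") as fh:
        fh.write(b"ignored")
    sizes = sizes_by_extension(tmp, (".pdf", ".exe"))
    print(sizes)
```

Here a.pdf is 3 bytes and b.exe is 5 bytes, while c.txt is skipped because its extension is not in the tuple.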
3. Using pathlib module in Python to loop through files in a directory
What is the pathlib module?
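pathlib is a standard-library module for working with filesystem paths in Python. It represents paths as objects rather than plain strings, which helps the same code work on Windows, macOS, and Linux.
Types of paths in pathlib
pathlib provides two kinds of path classes. Pure paths only manipulate the path itself and never touch the file system, while concrete paths inherit from pure paths and can also perform I/O, such as listing a directory.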
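The difference between the two kinds of paths can be seen in a short sketch. The paths below are made up for illustration and do not need to exist, because pure paths never access the file system.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Pure paths: string manipulation only, usable on any operating system.
win = PureWindowsPath(r"E:\drivers\Python.pdf")
print(win.name)    # Python.pdf
print(win.suffix)  # .pdf
print(win.parent)  # E:\drivers

# The / operator joins path components.
posix = PurePosixPath("/home/user") / "notes" / "todo.txt"
print(posix)  # /home/user/notes/todo.txt

# Concrete path: Path can query the real file system.
here = Path.cwd()
print(here.is_dir())
```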
3.1. Code
from pathlib import Path

path_of_the_directory = r'E:\Python for Data Science'
print("Files and directories in a specified path:")
# glob('*') yields every entry directly inside the directory.
entries = Path(path_of_the_directory).glob('*')
for entry in entries:
    print(entry)
We import the Path class from the pathlib module and give the path of the directory. Calling glob('*') on the Path object yields every file and directory directly inside it. The for loop then iterates over the results and prints each one.
Output
Files and directories in a specified path:
E:\Python for Data Science\FAQs.pdf
E:\Python for Data Science\Lec-1.pdf
E:\Python for Data Science\Lec-2.pdf
E:\Python for Data Science\Lec-3.pdf
E:\Python for Data Science\Lec-4.pdf
E:\Python for Data Science\Lec-5.pdf
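pathlib also offers Path.iterdir(), which lists the entries of a directory without any pattern. The following sketch, which swaps glob('*') for iterdir(), keeps only regular files; the helper name direct_files and the sample names are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def direct_files(directory):
    """Return sorted names of regular files directly inside directory.

    direct_files is a hypothetical helper written for this sketch.
    """
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "FAQs.pdf").touch()
    (root / "Lec-1.pdf").touch()
    (root / "archive").mkdir()
    names = direct_files(root)
    print(names)  # ['FAQs.pdf', 'Lec-1.pdf']
```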
3.2. Iterate over files with certain extension using pathlib module
from pathlib import Path

path_of_the_directory = r'E:\drivers'
# '**/' makes the search recursive, so subdirectories are included.
paths = Path(path_of_the_directory).glob('**/*.exe')
for path in paths:
    print(path)
We import the Path class and give the path of the directory. The pattern '**/*.exe' matches every file with the .exe extension in the directory and, because of the '**/' prefix, in all of its subdirectories. The for loop iterates over the matching paths and prints each one.
Output
E:\drivers\DriverEasy_Setup.exe
E:\drivers\driver_booster_setup.exe
E:\drivers\python-3.9.6-amd64.exe
E:\drivers\sp58516.exe
E:\drivers\sp59647.exe
E:\drivers\sp63302.exe
E:\drivers\sp64031.exe
E:\drivers\sp64949.exe
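To make the effect of '**/' visible, the sketch below compares a non-recursive pattern, the recursive pattern, and Path.rglob(), which is a shorthand for the recursive form. The directory layout is invented for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "top.exe").touch()
    (root / "nested").mkdir()
    (root / "nested" / "deep.exe").touch()

    # '*.exe' looks only at the directory itself.
    top_only = sorted(p.name for p in root.glob("*.exe"))
    # '**/*.exe' also descends into subdirectories.
    recursive = sorted(p.name for p in root.glob("**/*.exe"))
    # rglob('*.exe') behaves like glob('**/*.exe').
    via_rglob = sorted(p.name for p in root.rglob("*.exe"))

print(top_only)   # ['top.exe']
print(recursive)  # ['deep.exe', 'top.exe']
print(via_rglob)  # ['deep.exe', 'top.exe']
```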
4. Using os.walk() in Python to loop through files in a directory
What is os.walk()?
The os.walk() function generates the file names in a directory tree by walking it either top-down or bottom-up. For each directory it visits, including the starting one, it yields the directory path, the names of its subdirectories, and the names of its files.
Syntax
os.walk(top, topdown=True, onerror=None, followlinks=False)
Parameters
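- top: the root directory where the walk starts
- topdown: if True (the default), a directory is yielded before its subdirectories; if False, after them
- onerror: an optional function that is called with the OSError when listing a directory fails
- followlinks: if True, symbolic links that point to directories are followed; False by default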
Return
a generator that yields a 3-tuple (dirpath, dirnames, filenames) for each directory in the tree.
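The tuples that os.walk() yields can be inspected with a small sketch. It also shows that, in a top-down walk, removing names from dirnames in place stops os.walk() from descending into them. The helper name walk_summary, its skip parameter, and the directory layout are assumptions for illustration.

```python
import os
import tempfile


def walk_summary(top, skip=()):
    """Return (relative directory, sorted file names) for each visited directory.

    walk_summary and skip are hypothetical names written for this sketch.
    """
    summary = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Editing dirnames in place prunes the walk (top-down mode only).
        dirnames[:] = sorted(d for d in dirnames if d not in skip)
        summary.append((os.path.relpath(dirpath, top), sorted(filenames)))
    return summary


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    os.mkdir(os.path.join(tmp, "cache"))
    open(os.path.join(tmp, "a.txt"), "w").close()
    open(os.path.join(tmp, "sub", "b.txt"), "w").close()
    open(os.path.join(tmp, "cache", "c.txt"), "w").close()

    full = walk_summary(tmp)
    pruned = walk_summary(tmp, skip=("cache",))

print(full)    # [('.', ['a.txt']), ('cache', ['c.txt']), ('sub', ['b.txt'])]
print(pruned)  # [('.', ['a.txt']), ('sub', ['b.txt'])]
```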
4.1. Code
import os

path_of_the_directory = r'E:\Python for Data Science'
print("Files and directories in a specified path:")
for root, dirs, files in os.walk(path_of_the_directory):
    for i in files:
        # Join the current directory with the file name to get the full path.
        print(os.path.join(root, i))
First, we import the os module, which provides the functions for working with directories, and give the path of the directory. The outer for loop receives the current directory (root), its subdirectories (dirs), and its files (files) for every directory in the tree. The inner for loop goes through the file names, and os.path.join() combines each one with root to print the full path. If the directory does not exist, os.walk() yields nothing by default, so the loop body never runs and no error is shown.
Output
Files and directories in a specified path:
E:\Python for Data Science\FAQs.pdf
E:\Python for Data Science\Lec-1.pdf
E:\Python for Data Science\Lec-2.pdf
E:\Python for Data Science\Lec-3.pdf
E:\Python for Data Science\Lec-4.pdf
E:\Python for Data Science\Lec-5.pdf
4.2. Iterate over files with certain extension using os.walk()
import os

path_of_the_directory = r'E:\drivers'
ext = '.pdf'  # a single extension; use a tuple such as ('.pdf', '.exe') for several
for path, dirc, files in os.walk(path_of_the_directory):
    for name in files:
        if name.endswith(ext):
            print(name)
We import the os module and give the path of the directory. This time we only want files with a certain extension. The outer for loop walks the directory tree, the inner for loop goes through the file names in each directory, and the if statement checks whether a name ends with the extension. Files with a matching extension are displayed; all others are ignored.
Output
Python.pdf
C programming.pdf
Java.pdf
LearnEngineering.in.pdf
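Because os.walk() visits the whole tree, it is also handy for summarising a directory. As one possible sketch, the function below counts how many files of each extension exist, using os.path.splitext() to split off the extension; the name count_extensions and the sample files are assumptions for illustration.

```python
import os
import tempfile
from collections import Counter


def count_extensions(top):
    """Count files per lower-cased extension in the tree rooted at top.

    count_extensions is a hypothetical helper written for this sketch.
    """
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            counts[extension] += 1
    return counts


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "a.pdf"), "w").close()
    open(os.path.join(tmp, "sub", "b.PDF"), "w").close()
    open(os.path.join(tmp, "sub", "c.exe"), "w").close()
    counts = count_extensions(tmp)

print(counts[".pdf"])  # 2
print(counts[".exe"])  # 1
```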
5. Using glob module in Python to loop through files in a directory
What is the glob module?
The glob module finds all the files and directories whose paths match a specified pattern, using Unix shell-style wildcards. A question mark (?) matches exactly one character, whereas an asterisk (*) matches zero or more characters.
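The difference between the two wildcards is easiest to see side by side. In this sketch the file names are invented for illustration, and glob.escape() is applied to the temporary directory so that any special characters in its path are treated literally.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    for name in ("Lec-1.pdf", "Lec-2.pdf", "Lec-10.pdf", "FAQs.pdf"):
        open(os.path.join(tmp, name), "w").close()

    base = glob.escape(tmp)
    # '?' matches exactly one character, so 'Lec-10.pdf' is excluded.
    one_char = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(base, "Lec-?.pdf"))
    )
    # '*' matches zero or more characters, so 'Lec-10.pdf' is included.
    any_chars = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(base, "Lec-*.pdf"))
    )

print(one_char)   # ['Lec-1.pdf', 'Lec-2.pdf']
print(any_chars)  # ['Lec-1.pdf', 'Lec-10.pdf', 'Lec-2.pdf']
```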
5.1. Code
import glob

path_of_the_directory = r'E:\Python for Data Science'
print("Files and directories in a specified path:")
# iglob() returns an iterator instead of building the whole list at once.
for filename in glob.iglob(f'{path_of_the_directory}/*'):
    print(filename)
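We import the glob module and give the path of the directory. glob.iglob() returns an iterator over everything that matches the pattern, which here is every entry directly inside the directory. The for loop iterates over the matches and prints the files and directories of the specified path.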
Output
Files and directories in a specified path:
E:\Python for Data Science\FAQs.pdf
E:\Python for Data Science\Lec-1.pdf
E:\Python for Data Science\Lec-2.pdf
E:\Python for Data Science\Lec-3.pdf
E:\Python for Data Science\Lec-4.pdf
E:\Python for Data Science\Lec-5.pdf
5.2. Iterate over files with certain extension using glob module
import glob

# With recursive=True, '**' matches any number of nested subdirectories.
for i in glob.glob(r'E:\drivers\**\*.pdf', recursive=True):
    print(i)
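We import the glob module and pass the pattern, which includes the path of the directory, to glob.glob(). With recursive=True, the '**' part of the pattern matches the directory itself and all of its subdirectories. The for loop iterates over the matches and prints the files with the .pdf extension.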
Output
E:\drivers\Python.pdf
E:\drivers\C Programming.pdf
E:\drivers\Java.pdf
E:\drivers\LearnEngineering.in.pdf
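The recursive flag changes what '**' means. The sketch below, run against an invented directory layout, shows that without recursive=True the '**' behaves like a single '*' and matches only one directory level.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub", "deeper"))
    open(os.path.join(tmp, "top.pdf"), "w").close()
    open(os.path.join(tmp, "sub", "mid.pdf"), "w").close()
    open(os.path.join(tmp, "sub", "deeper", "low.pdf"), "w").close()

    pattern = os.path.join(glob.escape(tmp), "**", "*.pdf")
    # recursive=True: '**' matches zero or more nested directories.
    all_levels = sorted(
        os.path.basename(p) for p in glob.glob(pattern, recursive=True)
    )
    # recursive=False (the default): '**' acts like '*', one level only.
    one_level = sorted(os.path.basename(p) for p in glob.glob(pattern))

print(all_levels)  # ['low.pdf', 'mid.pdf', 'top.pdf']
print(one_level)   # ['mid.pdf']
```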
Frequently Asked Questions Related to Python loop through files in directory
Python provides five different methods to iterate over files in a directory. os.listdir(), os.scandir(), pathlib module, os.walk(), and glob module are the methods available to iterate over files.
os.scandir() was introduced in Python 3.5.
The pathlib module was introduced in Python 3.4.
If the path is not specified in os.listdir(), it will display the files of the current working directory.
Conclusion
We have come to the end of the article. We now know how to iterate over files in a directory in five different ways. These methods are straightforward to understand, and they are also useful for many other file-handling tasks in Python.