Unzip a File in Python: 5 Scenarios You Should Know

If you are in the IT world for a while, you must know about the unzipping or extraction of a file. More than 70% of the files present on the internet are in the form of a  lossless compressed file (ex. zip, jpeg, tar, rar). There are various apps available in the market to unzip a file. One of the most popular software to unzip or extract is Winrar. But in today’s article, we will learn how we can unzip a file in Python.

In this particular tutorial, we will learn how to deal with 5 different scenarios, when we want to extract or unzip a file in Python. For all those who don’t know what extraction is let me briefly explain it to you. Unzip is a term used to describe the process of decompressing and moving one or more files in a compressed zipped file to an alternate location.

Let’s see various conditions in which we can extract a file using Python.

5 Situations in Which You Can Extract a File Using Python

  • Extracting only one file
  • Unzip all / multiple files from a zip file to the current directory
  • Extracting all the Files into another directory
  • Unzipping only some specific files based on different conditions
  • Unzipping Password Protected zip file using extractall()

Time to Code!

Module Used to Unzip Files in Python

To extract a file using Python, we will use the zipfile module in Python. The zip file module is used to access functionalities that would help us create, read, write, extract and list a ZIP file in Python.

Syntax

ZipFile.extractall(path=None, members=None, pwd=None)

Parameters

  • path: This path parameter stores a path to the directory where the zip files need to be unzipped. The file is extracted in the current working directory if it is not specified.
  • Members: This parameter is used to add the files list to be extracted. If no argument is provided, it will extract all the files.
  • pwd: This parameter is used to extract an encrypted file with a password.

1. Extracting only one file

Sometimes, we only require a specific file from the zip file to do our task. In such cases, extracting the whole zip file will consume time as well as the memory of your computer. To avoid this, there is an option to extract the specific file/folder from the zip and save it on your computer. The following code demonstrates it –

Code –

Currently, our zip file contains three files – “a.txt”, “p.txt”, and “pool.txt”

import zipfile

zip_file = "a.zip"
file_to_extract = "a.txt"

try:
    with zipfile.ZipFile(zip_file) as z:
        with open(file_to_extract, 'wb') as f:
            f.write(z.read(file_to_extract))
            print("Extracted", file_to_extract)
except:
    print("Invalid file")

Output –

Extracted a.txt

Explanation –

Firstly, we import the module zipfile in the first line. After that, we need to declare some constant variables which can be used later in the code. In this specific example, we’ve extracted “a.txt” from the zip file “a.zip”. With the help of the ZipFile.read() function, you can store the binary value of the file in a variable and this variable can be dumped on the local file to extract it.

2. Unzip all / multiple files from a zip file to the current directory in Python

Extracting all the files using python is one of the best features of python. If you have python installed on your computer, you don’t even need software like Winrar to extract zip files. Moreover, you can not only extract zip files but also, ‘.tar.gz’, ‘.apk’, ‘.rar’ and other formats too. What’s more exciting is that you don’t need to declare any type of zipping. The module automatically identifies whether it’s zip, rar, or other formats.

Code –

import zipfile

zip_file = "a.zip"

try:
    with zipfile.ZipFile(zip_file) as z:
        z.extractall()
        print("Extracted all")
except:
    print("Invalid file")

Output –

Extracted all

Explanation –

As a default, we import the zipfile module in the first file. Then we use the context manager to use the ZipFile class. With the help of this method, you don’t need to close the zipfile after opening it. By using the extractall() method, you can extract all the files in the current directory of your python code.

3. Extracting all the Files into another directory in Python

It’s very important to learn file management in coding. You have to create a better space for your files to avoid any hassle in the same folder. By using the extractall method with the parameter you can extract the zip to any folder you want. If the directory is not present, the module will automatically create a new empty directory.

Code –

import zipfile

zip_file = "a.zip"

try:
    with zipfile.ZipFile(zip_file) as z:
        z.extractall("temp")
        print("Extracted all")
except:
    print("Invalid file")

Output –

Extracted all

Explanation –

We import the zipfile module and open the zipfile by using ZipFile class. With the help of extractall(path), you can extract all the contents inside the zip file in the specifically mentioned path. If the directory is already present, all the files will be overwritten. If the directory is absent, a new empty folder will be created.

4. Unzipping only some specific files based on different conditions in Python

It’s very convenient to extract specific types of files from the zip file. Suppose, you have a zip file that contains images, videos, and other types of files, and you need to extract only images. Then conditional extracting will help you. You can use namelist() method with conditions to extract them.

Code –

Currently, our zip file contains three files – “a.txt”, “p.txt”, “pool.txt”, “a.jpeg”, “b.jpeg” and “a.mp4”

import zipfile

zip_file = "a.zip"
endswith = ".jpeg"
try:
    with zipfile.ZipFile(zip_file) as z:
        for file in z.namelist():
            if file.endswith(endswith):
                z.extract(file)
        print("Extracted all ", endswith)
except:
    print("Invalid file")

Output –

Extracted all  .jpeg

Explanation –

We begin with importing the zipfile module. Then we declare some constants and others the zip file as a context manager. All the file names from the zip file can be obtained by the namelist() method. This method returns the list of all file objects. Nextly we check if the file ends with a specific file format (In our case we used ‘.jpeg’ format).

If the file ends with a specific extension, then we can extract the file. All the files will be extracted in the same folder. If you want to extract them to a specific folder, use the path parameter in the extractall() method.

5. Unzipping Password Protected Zip Files using extractall() in Python

One of the features of zip files is that you can apply a password to them. Moreover, you can also encrypt the file names to make it more difficult. In this section, we’ll only discuss the basic password manager for the zip files and how to use them.

Note – Every software uses a different encryption technique and it may happen that you cannot extract it using python. If you are aware of proper encryption for your zip file, use pyzipper module to create a proper zip object.

Code –

import zipfile

zip_file = "a.zip"
try:
    with zipfile.ZipFile(zip_file) as z:
        z.setpassword(bytes("pythonpool","utf-8"))
        z.extractall()
        print("Extracted all ")
except Exception as e:
    print("Invalid file", e)

Output –

Extracted all

Explanation –

We import the zipfile module and open the zip file using the context manager. The setpassword method allows you to initialize a password for your zip file. The only condition is that it takes bytes object as input. We’ve used the bytes() function to convert our string into bytes. Finally, using the extractall() method, you can unzip all the files from the zip.

If the zip file is compressed by software like WinRar, you need to use my zipper module which has advanced methods to determine the algorithms.

Must Read

Final Words

Python has proved to be one of the best computer programming languages, with the support of thousands of modules. With easy-to-write syntaxes and APIs for every system module, you can create beautiful applications to ease your tasks. Python unzip file is one feature that allows you to automate your extraction tasks.

Share this post with your friends and let them know about this awesome module!

Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments