Set Difference in Python: All You Need to Know

While programming in Python, every coder comes across sets sooner or later. They are one of the language's basic built-in data types.

A set is a data structure that stores unique elements in no particular order. It has no index, it can be mutated (elements can be added or removed), and it supports mathematical operations such as union, intersection, and difference.
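
As a quick illustration of how a Python set behaves, here is a minimal sketch. The variable names are only for illustration.

```python
# A small set of integers; the duplicate 20 is stored only once.
s = {10, 20, 20, 30}
size = len(s)

# Sets are mutable: elements can be added and removed in place.
s.add(40)
s.discard(10)

# Sets do not support indexing; s[0] raises TypeError.
try:
    s[0]
    indexable = True
except TypeError:
    indexable = False

print(size, sorted(s), indexable)  # 3 [20, 30, 40] False
```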

There are many operations that can be performed on sets. In this article, we are going to talk about an operation that, in simple words, gives us a difference between two or more sets.

What is the set difference?

When you first hear the term set difference, you might think it means subtracting the values of one set from the values of another, element by element. For example:

set x = {10, 20, 30}
set y = {5, 10, 20}
difference between set x and set y = {5, 10} (from 10 - 5, 20 - 10, and 30 - 20)

But that is not what it means. The difference between set x and set y is the set of all elements that are in x but not in y.

For example:

set x = {5, 20, 30}
set y = {15, 10, 20}
difference between set x and set y = {5, 30}
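
To make the definition concrete, the sketch below builds the difference by hand with a set comprehension and compares it with Python's built-in result. The name by_hand is only illustrative.

```python
x = {5, 20, 30}
y = {15, 10, 20}

# Keep every element of x that does not appear in y.
by_hand = {e for e in x if e not in y}

print(sorted(by_hand))   # [5, 30]
print(by_hand == x - y)  # True
```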

Note: there is a catch when carrying out “difference” operations between sets, which will be discussed further in the article.

Commands for set difference and their syntax

The set difference can be computed in three ways in Python. They are all simple, and each has its own advantages.

Method 1: Using “-” Operator

The “-” operator requires both of its operands to be sets (or frozensets); it cannot take any other iterable, such as a list or a tuple.

The syntax for “-” is pretty simple. Let’s understand it using an example:

x = {10, 20, 30}
y = {15, 10, 20}
a = x - y
b = y - x
print(a)
print(b)
Output:
{30}
{15}

As you can see, x - y returns the values that are only in set x, and y - x returns the values that are only in set y.

This is the catch mentioned in the note earlier: the result of a difference in Python depends on the order of the operands.

Although this method is simple in usage and syntax, each “-” works on exactly two operands, so several sets have to be chained (x - y - z), and every operand must be a set. These limitations can be overcome by employing a different method.
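
Here is a short sketch of what the “-” operator does and does not accept. The sets used are made up for illustration.

```python
x = {10, 20, 30, 40}
y = {10, 20}
z = {40}

# The operator is binary, but it can be chained left to right.
chained = x - y - z

# A list on the right-hand side is rejected with TypeError.
try:
    x - [10, 20]
    accepted_list = True
except TypeError:
    accepted_list = False

print(chained, accepted_list)  # {30} False
```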

Method 2: Using “difference()” Command

Another way to get the set difference is the .difference() method. Let’s understand its usage and syntax with an example:

x = {10, 20, 30, 13}
y = {15, 10, 20, 17}
a = x.difference(y)
b = y.difference(x)
print(a)
print(b)
Output (the order of elements within a set may vary):
{30, 13}
{15, 17}

We can also pass several sets to this command at once. It then returns the elements of the first set that appear in none of the others. For example:

x = {10, 20, 30, 13}
y = {15, 10, 20, 17}
z = {10, 15, 17}
a = x.difference(y, z)
print(a)
Output:
{30, 13}
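
Unlike the “-” operator, difference() is not limited to sets as arguments. The sketch below passes a list, a tuple, and a range, and also shows the related in-place method difference_update(). The argument values are chosen only for illustration.

```python
x = {10, 20, 30, 13}

# difference() accepts any iterables: here a list, a tuple and a range.
a = x.difference([10], (20,), range(25, 35))
print(a)  # {13}

# difference_update() removes the given elements from x itself.
x.difference_update([10, 20])
print(sorted(x))  # [13, 30]
```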

Method 3: Using “symmetric_difference()” Command

The two methods above only return the elements unique to one set. Often you want the elements unique to each of the two sets, that is, every element that belongs to exactly one of them. For this requirement, there is a command called “symmetric_difference()”.

This command’s usage and syntax are similar to the “difference()” command, except that it takes exactly one argument. Let’s see with the help of an example:

x = {10, 20, 30, 13}
y = {15, 10, 20, 17}
a = x.symmetric_difference(y)
print(a)
Output:
{30, 13, 15, 17}

From the code above, you can see that you get the elements that are unique to either set. You can also add these values to a list and use them further.
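
The symmetric difference can also be written with the ^ operator, or as the union minus the intersection. The sketch below checks that all three forms agree; the variable names are illustrative.

```python
x = {10, 20, 30, 13}
y = {15, 10, 20, 17}

via_method = x.symmetric_difference(y)
via_operator = x ^ y              # operator form; both operands must be sets
via_identity = (x | y) - (x & y)  # union minus intersection

print(sorted(via_method))  # [13, 15, 17, 30]
print(via_method == via_operator == via_identity)  # True
```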

Difference Between All The Three Methods

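The “-” operator and difference() return the same result: the elements found only in the first set. They differ in what they accept, since the operator works only on sets, while difference() takes any iterable and any number of arguments. symmetric_difference() serves a different use case: it returns the elements found in exactly one of the two sets.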

The Venn diagram below explains how the commands differ from each other.

[Figure: Difference Between symmetric_difference() and "-" Operator]

Using set differences for lists

You can use the above-discussed methods to find the difference between two lists in Python.

You can use Method 1 & 2 for finding the asymmetric difference or, in simple words, unique elements of one list.

Method 3 can be used for finding the symmetric difference or all the unique elements of both lists.

Using Methods 1 & 2

To find the asymmetric difference between two lists using methods 1 & 2, you need to follow the steps given below:

  1. First, convert the lists to sets
  2. Apply Method 1 or Method 2
  3. Convert the sets back to lists

Let’s understand better with the help of an example:

list_a = [10, 15, 16, 9]
list_b = [10, 14, 9, 8]
x = list(set(list_a) - set(list_b))  # we have used method 1 here
y = list(set(list_b).difference(set(list_a)))  # we have used method 2 here
print(x)
print(y)
Output (the order of the elements may vary, because sets are unordered):
[15, 16]
[14, 8]

From the code above, you can see that we use the set() and list() commands to convert lists to sets and back. Note that this conversion drops duplicate elements and does not preserve the original order of the list.
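
If the original order or repeated elements of the list matter, a list comprehension is one alternative to the set round trip. This is a minimal sketch; ordered_difference is a hypothetical helper name, not a built-in.

```python
def ordered_difference(first, second):
    """Hypothetical helper: items of `first` not in `second`, in original order."""
    exclude = set(second)  # a set gives fast membership tests
    return [item for item in first if item not in exclude]

list_a = [10, 15, 16, 9, 15]
list_b = [10, 14, 9, 8]
print(ordered_difference(list_a, list_b))  # [15, 16, 15]
```

Here the repeated 15 is kept, which the set-based approach would collapse into a single element.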

Using Method 3

Method 3 can be used for finding symmetric differences between two lists by following the steps given below:

  1. Convert the lists to sets
  2. Apply Method 3
  3. Convert the sets back to lists

For example:

list_a = [10, 15, 16, 9]
list_b = [10, 14, 9, 8]
x = list(set(list_b).symmetric_difference(set(list_a)))  # we have used method 3 here
print(x)
Output (the order of the elements may vary):
[15, 16, 14, 8]
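
When a reproducible order is needed, one option is to sort the result instead of calling list() on it. A small sketch, with an illustrative variable name:

```python
list_a = [10, 15, 16, 9]
list_b = [10, 14, 9, 8]

# symmetric_difference() accepts any iterable, so one set() call is enough.
unique_to_either = set(list_a).symmetric_difference(list_b)

# sorted() returns a list in ascending order, whatever the set's iteration order.
print(sorted(unique_to_either))  # [8, 14, 15, 16]
```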

Using set difference for pandas data-frames

Dealing with data frames is quite common in programming. Here, we discuss how we can find unique values between two data-frames.

You can use Methods 1 to 3 to find the rows that differ between two pandas data-frames by following the steps given below:

  1. Convert the data-frames into sets of tuples
  2. Apply any of the methods discussed above to get the desired result
  3. Convert the resulting set into a list
  4. Convert the list into a data-frame

For example:

import pandas as pd

a = pd.DataFrame({'col1':[10, 20, 30], 'col2':[20, 30, 40]})
b = pd.DataFrame({'col1':[40, 20, 50], 'col2':[60, 30, 50]})
x = set(map(tuple, a.values))  # each row becomes a tuple
y = set(map(tuple, b.values))
# sorted() gives a fixed row order; columns= restores the column names
z = pd.DataFrame(sorted(x.difference(y)), columns=a.columns)
print(z)
Output:
   col1  col2
0    10    20
1    30    40
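
The same rows can also be found without leaving pandas. The sketch below is one possible alternative, assuming both data-frames share the same column names: it uses DataFrame.merge() with indicator=True and keeps the rows marked as left_only. The name only_in_a is illustrative.

```python
import pandas as pd

a = pd.DataFrame({'col1': [10, 20, 30], 'col2': [20, 30, 40]})
b = pd.DataFrame({'col1': [40, 20, 50], 'col2': [60, 30, 50]})

# A left merge on all shared columns; the _merge column records row origin.
merged = a.merge(b, how='left', indicator=True)

# Keep rows present only in a, then drop the helper column.
only_in_a = (
    merged[merged['_merge'] == 'left_only']
    .drop(columns='_merge')
    .reset_index(drop=True)
)
print(only_in_a.values.tolist())  # [[10, 20], [30, 40]]
```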

Using set difference for numpy arrays

NumPy arrays are another crucial data type that programmers and developers have to work with. Here, we discuss how to compute the set difference of NumPy arrays.

NumPy provides the np.setdiff1d() function for this. It returns the sorted, unique values of the first array that are not in the second:

import numpy as np

x = np.array([2, 4, 7, 8, 11, 17])
y = np.array([4, 8, 17])
z = np.setdiff1d(x, y)
print(z)
Output:
[ 2  7 11]
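
NumPy also has a counterpart for the symmetric difference, np.setxor1d(). The sketch below uses made-up arrays to show both functions on unsorted input that contains a duplicate.

```python
import numpy as np

x = np.array([17, 2, 2, 11, 4])
y = np.array([4, 8, 17])

# setdiff1d: values in x that are not in y, returned sorted and de-duplicated.
only_in_x = np.setdiff1d(x, y)

# setxor1d: values that are in exactly one of the two arrays.
in_exactly_one = np.setxor1d(x, y)

print(only_in_x.tolist())       # [2, 11]
print(in_exactly_one.tolist())  # [2, 8, 11]
```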

FAQs on Python Set Difference

Is set difference operator in Python commutative?

No, the set difference operator is not commutative. As discussed earlier in the article, one might assume that x - y = y - x, but in general x - y is not equal to y - x.

What is the difference between difference and symmetric difference in Python?

The difference returns the elements that are only in the first set, whereas the symmetric difference returns the elements that are in exactly one of the two sets.

Are set operations commutative?

Some of the set operations in Python are commutative, and some are not. For example, union and intersection operations are commutative, but the difference is not.
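
This can be checked directly with the set operators. A small sketch with illustrative sets:

```python
x = {10, 20, 30}
y = {20, 30, 40}

print((x | y) == (y | x))  # True: union
print((x & y) == (y & x))  # True: intersection
print((x ^ y) == (y ^ x))  # True: symmetric difference
print((x - y) == (y - x))  # False: {10} versus {40}
```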

Conclusion:

Python offers several easy-to-use commands for the set difference: the “-” operator, difference(), and symmetric_difference(). We also discussed how to apply them to lists, pandas data-frames, and NumPy arrays. A programmer can build logic around these commands to get the results they want.
