How to remove duplicates from a list in Python
Introduction
As a beginner programmer, you might come across situations where you have to deal with duplicate items in a list. In this tutorial, we will explore different techniques to remove duplicates from a list in Python. Don't worry if you are not familiar with some terms; we will explain them as we go along.
Understanding Lists in Python
Before diving into the details of removing duplicates from a list, let's first understand what a list is. In Python, a list is a built-in data structure that can hold a collection of items. These items can be of different data types like integers, strings, or even other lists. Lists are ordered, mutable, and can have duplicate elements.
Here is an example of a list containing different types of data:
my_list = [1, "apple", 3.14, [1, 2, 3]]
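The properties mentioned above (ordered, mutable, duplicates allowed) can be seen in a short sketch. The list name `fruits` and its contents are made up for illustration only.

```python
# A list keeps its items in insertion order and allows repeated values.
fruits = ["apple", "banana", "apple"]

# Lists are mutable: items can be added or replaced in place.
fruits.append("cherry")
fruits[1] = "kiwi"

# Indexing relies on the order; count() shows that duplicates are kept.
first_item = fruits[0]
apple_count = fruits.count("apple")

print(fruits)       # ['apple', 'kiwi', 'apple', 'cherry']
print(first_item)   # apple
print(apple_count)  # 2
```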
Now that we know what a list is, let's see how we can remove duplicate items from a list.
Technique 1: Using a Loop and an Empty List
One of the simplest ways to remove duplicates from a list is by using a loop and an empty list. The idea is to iterate through each element of the list and check if it is already present in the new list. If it is not present, then add it to the new list.
Here is an example:
def remove_duplicates(input_list):
    unique_list = []
    for item in input_list:
        if item not in unique_list:
            unique_list.append(item)
    return unique_list

my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = remove_duplicates(my_list)
print(unique_list) # Output: [1, 2, 3, 4, 5]
In this example, we define a function called remove_duplicates that takes a list as input and returns a new list with duplicates removed. The function iterates through each item in the input list and appends it to unique_list only if it is not already there. Because the first occurrence of each item is kept, the original order of the elements is preserved.
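One property of the loop approach is worth a quick sketch: the `not in` check compares items with `==`, so it also works for items such as nested lists. The sample list `mixed` is invented for illustration. Note that every check scans the new list from the start, so this approach gets slow on long lists.

```python
def remove_duplicates(input_list):
    # Build a new list, keeping only the first occurrence of each item.
    unique_list = []
    for item in input_list:
        if item not in unique_list:  # equality check against every kept item
            unique_list.append(item)
    return unique_list


# Hypothetical sample data: strings plus a repeated nested list.
mixed = ["b", "a", "b", [1, 2], [1, 2], "a"]
result = remove_duplicates(mixed)
print(result)  # ['b', 'a', [1, 2]]
```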
Technique 2: Using List Comprehension
List comprehension is a concise way to create lists in Python: a single expression builds a new list from an existing sequence, optionally filtering the items with a condition. In this technique, we will use list comprehension to remove duplicates from a list.
Here is an example:
def remove_duplicates(input_list):
    unique_list = [item for i, item in enumerate(input_list) if item not in input_list[:i]]
    return unique_list

my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = remove_duplicates(my_list)
print(unique_list) # Output: [1, 2, 3, 4, 5]
In this example, we use list comprehension to create a new list called unique_list that contains only unique items from the input list. The enumerate function is used to get both the index i and the item from the input list. We then check whether the item appears in the slice of the input list before the current index, input_list[:i]. If it does not, the item is included in unique_list. Because a new slice is created and scanned for every item, this version is compact but slow on long lists.
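To make the slice check easier to follow, here is a small trace of what the comprehension compares at each step. The list `words` is a made-up example, and the explicit loop is only there to print the intermediate values.

```python
words = ["pear", "apple", "pear", "fig"]

# Show the slice of earlier items that each item is compared against.
for i, word in enumerate(words):
    earlier = words[:i]
    print(i, word, earlier, word not in earlier)
# 0 pear [] True
# 1 apple ['pear'] True
# 2 pear ['pear', 'apple'] False
# 3 fig ['pear', 'apple', 'pear'] True

# The comprehension keeps exactly the items where the check was True.
unique_words = [word for i, word in enumerate(words) if word not in words[:i]]
print(unique_words)  # ['pear', 'apple', 'fig']
```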
Technique 3: Using a Set
A set is another built-in data structure in Python that can hold a collection of unique items. Sets are unordered, mutable, and do not allow duplicate elements. We can utilize the properties of a set to remove duplicates from a list easily.
Here is an example:
def remove_duplicates(input_list):
    unique_list = list(set(input_list))
    return unique_list

my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = remove_duplicates(my_list)
print(unique_list) # Typical output: [1, 2, 3, 4, 5] (the order is not guaranteed)
In this example, we convert the input list to a set using the set() function. This removes duplicates since sets can only have unique elements. Then, we convert the set back to a list using the list() function. Keep in mind that this method does not guarantee the order of elements from the input list, and it only works when every item is hashable (numbers, strings, and tuples are; lists are not).
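If the original order matters, a set can still help as a fast lookup next to an ordinary loop. The following is a minimal sketch of that idea, assuming all items are hashable; the function name `remove_duplicates_keep_order` and the sample list are hypothetical, not part of the techniques above.

```python
def remove_duplicates_keep_order(input_list):
    # The set answers "have we seen this item?" quickly,
    # while the list records items in their original order.
    seen = set()
    unique_list = []
    for item in input_list:
        if item not in seen:
            seen.add(item)
            unique_list.append(item)
    return unique_list


words = ["pear", "apple", "pear", "fig", "apple"]

unordered = list(set(words))                   # order may differ between runs
ordered = remove_duplicates_keep_order(words)  # first occurrences, in order

print(sorted(unordered))  # ['apple', 'fig', 'pear']
print(ordered)            # ['pear', 'apple', 'fig']
```

The plain set conversion is sorted before printing here only because its order cannot be relied on.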
Technique 4: Using a Dictionary
Another way to remove duplicates from a list is by using a dictionary. A dictionary is a built-in data structure in Python that can store a collection of key-value pairs. Dictionaries are mutable and do not allow duplicate keys, and since Python 3.7 they are guaranteed to keep keys in the order they were inserted. We can use these properties to remove duplicates from a list.
Here is an example:
def remove_duplicates(input_list):
    unique_dict = {item: None for item in input_list}
    unique_list = list(unique_dict.keys())
    return unique_list

my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = remove_duplicates(my_list)
print(unique_list) # Output: [1, 2, 3, 4, 5]
In this example, we use a dictionary comprehension to create a new dictionary called unique_dict with the items from the input list as keys and None as the value for each key. Since dictionaries do not allow duplicate keys, this removes duplicates from the input list. Then, we convert the keys of the dictionary to a list using the list() function and the keys() method. On Python 3.7 and later, the keys come back in the order in which they first appeared in the input list.
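A short sketch of what the intermediate dictionary looks like, and of one limitation: dictionary keys must be hashable, so a list of lists cannot be deduplicated this way. The sample data and the `keys_are_hashable` flag are made up for illustration, and the order shown assumes Python 3.7 or later.

```python
words = ["pear", "apple", "pear", "fig", "apple"]

# Repeated keys simply overwrite the earlier entry, so each word appears once.
unique_dict = {word: None for word in words}
print(unique_dict)   # {'pear': None, 'apple': None, 'fig': None}

unique_words = list(unique_dict.keys())
print(unique_words)  # ['pear', 'apple', 'fig']

# Lists are not hashable, so they cannot be used as dictionary keys.
nested = [[1, 2], [1, 2]]
try:
    {item: None for item in nested}
    keys_are_hashable = True
except TypeError:
    keys_are_hashable = False
print(keys_are_hashable)  # False
```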
Technique 5: Using collections.OrderedDict
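An OrderedDict is a dictionary subclass that remembers the order in which keys were inserted. It is a part of the collections module in Python. Before Python 3.7, regular dictionaries did not guarantee insertion order, so OrderedDict was the reliable way to remove duplicates from a list while preserving the order of elements. It still works the same way on current versions.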
Here is an example:
from collections import OrderedDict

def remove_duplicates(input_list):
    unique_list = list(OrderedDict.fromkeys(input_list))
    return unique_list

my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = remove_duplicates(my_list)
print(unique_list) # Output: [1, 2, 3, 4, 5]
In this example, we import the OrderedDict class from the collections module. Then, we create an OrderedDict from the input list using the fromkeys() method. This removes duplicates while maintaining the order of elements. Finally, we convert the keys of the OrderedDict to a list using the list() function.
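The built-in dict has the same fromkeys() method, so on Python 3.7 and later the two can be compared directly. This is a small sketch with made-up sample data; the variable names `via_ordered_dict` and `via_dict` are only illustrative.

```python
from collections import OrderedDict

words = ["pear", "apple", "pear", "fig", "apple"]

# fromkeys() creates one key per distinct item, in first-seen order.
via_ordered_dict = list(OrderedDict.fromkeys(words))
via_dict = list(dict.fromkeys(words))  # relies on Python 3.7+ ordering

print(via_ordered_dict)              # ['pear', 'apple', 'fig']
print(via_ordered_dict == via_dict)  # True
```

If the code only needs to run on Python 3.7 or later, the plain dict version avoids the extra import.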
Conclusion
In this tutorial, we learned about different techniques to remove duplicates from a list in Python. Each technique has its pros and cons, and you can choose the one that best suits your requirements. Converting to a set is short and fast but does not guarantee the original order. The loop and list comprehension approaches preserve order and work with any items that can be compared, but they slow down on long lists. A dictionary (on Python 3.7 and later) or an OrderedDict preserves order and stays fast, provided the items are hashable. As you progress in your programming journey, you can experiment with these techniques and even come up with your own creative solutions to tackle similar problems. Happy coding!