DFS in Python: 2 Algorithms I wish I have learned earlier

Posted by Marta on February 1, 2023 Viewed 6799 times

Card image cap

This article will explain in plain English what DFS( Depth For Search) and BFS(Breath For Search) algorithms are. We will focus in Python, however the principles are applicable to any other programming language. Don’t let the names intimidate you. These algorithms are much easier that they look.

Once I learned how they work and find out how easy the mechanism underneath I regret I haven’t learned this much earlier.

This article aims to give you an approachable and practical explanation of these algorithms. You will learn how DFS and BFS are different, implement DFS and BFS in python, and some real-world applications of these algorithms.

Depth For Search or DFS

DFS and BFS are both mazes exploring strategies; in other words, each defines a different approach to decide how to explore the nodes in a graph or the paths in a maze. This exploring algorithm can be useful if we want to find if two points in a graph are connected. The algorithm will first need to explore the maze. Suppose you want to see if there is a way to get from your place to your friend’s home by car. First, you need to explore a map, and all paths starting from your home. And see if any of those paths lead to your friend’s house.

Usually, you will explore the paths in random order; DFS and DFS give you a systematic way to explore the map. 

DFS Application

A different and exciting application of DFS is finding strongly connected components. What does that mean? A strongly connected component is a group of nodes with a lot of connections between them. For instance, Social Networks are one of the exciting applications. A strong connected component in a social network could be representing a group of people with many relations between them. That could imply all these people are friends, friends of friends, or work at the same company.

The web network world uses graphs and DFS as well. Each website is a node, and the hyperlinks are the edges connecting the nodes. A strongly connected component in these scenarios can represent a set of websites that belongs to the same domain.

These are examples of how you can extract meaning out of networks by exploring their nodes and their connections.

DFS pseudocode

DFS exploration strategy algorithm follows a path to the end, only backtracking if it gets to the path end. DFS keeps track of two things: visited nodes and the nodes left to explore, using a stack(FIFO: First In First Out) to keep track of the nodes left to explore. Meaning the last node discovered is the next node to be explored. Meaning it will follow the current path to the end. Let’s see an example. Imagine the below graph:

The following video shows how to explore the graph following the DFS strategy, starting from the F node:

Pseudocode:

1. Initialise stack where_to_go_next
2. Add the origin to where_to_go_next stack
3. Initialise visited nodes
4. Get the current_node neighbours nodes
5. While ( where_to_go_next is not empty)
	6. Current_node = get first node from where_to_go_next stack
    7. Add to current_node to visited nodes
    8. Get neighbourds of the current_node
    9. For each neighbourd
    	10. Add to where_to_go_next stack

DFS python code – Iterative

Now that we understand how the DFS algorithm works let’s see how to translate this pseudocode into python. This code snippet only includes the DFS implementation. You can find the code that builds the graph and the stack further down in this article. Put them all together in the same .py extension file to get the code running.

def dfs(graph,origin):
    where_to_go_next = Stack()
    where_to_go_next.add(origin)
    already_visited = []

    while not where_to_go_next.is_empty():
        current_node = where_to_go_next.pop()
        print(current_node)
        already_visited.append(current_node)
        neighbours = graph[current_node.hashkey()]
        for neighbour in neighbours:
            if neighbour not in already_visited:
                where_to_go_next.add(neighbour)
                
matrix = [[1,1,1,1],[0,1,0,1],[1,0,0,1],[1,1,1,1]]
graph = build_graph(matrix)
dfs(graph,Cell(3,3))

First, the code will initialize the visited node list and the stack containing the nodes to explore. The code will loop until the stack is empty, which means examining all graph nodes. Once the algorithm went through all nodes, the code has completed the exploration.

In each loop iteration, we will pick the last node inserted in the stack, mark it as visited. Then get all adjacent nodes, and insert these adjacent nodes to the stack, so they are next for exploration.

DFS python code – Recursive

So far, we have seen how you can implement DFS in an iterative approach using a stack. However, DFS implementation can also be recursive. We will define two things: the end case and how to divide the problem. We reached the end case when the algorithm examined all nodes.

In case there are still nodes to visit. First, we will mark the current or origin node as seen and then run a DFS scan for each adjacent node. In this case, no stack is necessary since the computer manages the recursion, which has the same role as the stack.

def dfs_recursive(graph, origin, visited = []):
    if len(graph.keys())==len(visited): # all nodes visited
        return None
    else:
        visited.append(origin)
        print(origin)
        neighbours =  graph[origin.hashkey()]
        for neighbour in neighbours:
            if neighbour not in visited:
                dfs_recursive(graph, neighbour, visited)

DFS vs BFS

What’s the difference between DFS and BFS? The critical difference is that BFS uses a queue instead of a stack to keep track of the nodes left to explore. This change will alter the order in which the algorithm discovers the nodes. Instead of following a path to the end, it will examine the graph in “layers.”

DFS Build Python Graph

The following python code illustrates a possible way of building a graph using a dictionary, where the keys are the nodes themselves, and the values are a list of the adjacent nodes.

def build_graph(area):
    graph = {}
    for row in range(len(area)):
        for col in range(len(area[row])):
            if (area[row][col] != 0):
                current_cell = Cell(row, col)
                cell_doesnt_exist_on_graph = graph.get(current_cell.hashkey()) == None
                if(cell_doesnt_exist_on_graph):
                    graph[current_cell.hashkey()]= []
                # Check the top, left, down and right cells to know the cell that are connected
                adjacents = graph.get(current_cell.hashkey())
                if(row-1>-1):
                    if(area[row-1][col]!=0):
                        adjacents.append(Cell(row-1,col))
                if(col-1>-1):
                    if(area[row][col-1]!=0):
                        adjacents.append(Cell(row,col-1))
                if(row+1<len(area)):
                    if (area[row+1][col]!=0):
                        adjacents.append(Cell(row+1,col))
                if(col+1<len(area[row])):
                    if(area[row][col+1]!=0):
                        adjacents.append(Cell(row,col+1))
                graph[current_cell.hashkey()] = adjacents

    return graph

The Cell class will represent each node. Each node is defined by its position, row, and col, in an imaginary matrix, where the top-left corner is the matrix’s origin.

class Cell:
    def __init__(self, row, col):
        self.row = row
        self.col = col

    def hashkey(self):
        return str(self.row)+str(self.col)

    def __eq__(self, other):
        return self.row == other.row and self.col == other.col

    def __str__(self):
        return str(self.row)+str(self.col)

Stack in Python

This python code illustrates how to implement a Stack following FIFO(First in First Out)

class Stack:
    def __init__(self):
        self.list = []

    def add(self,item):
        if item:
            self.list.append(item)

    def pop(self):
        if len(self.list) == 0: return None
        last_item = self.list[-1]
        del(self.list[-1])
        return last_item

    def print(self):
        print(self.list)

    def is_empty(self):
        return len(self.list) == 0

Queue in Python

The python code below illustrates how to implement a Queue, which follows LIFO( Last In First Out)

class Queue:
    def __init__(self):
        self.list = []

    def add(self,item):
        if item:
            self.list.append(item)

    def pop(self):
        if len(self.list) == 0: return None
        last_item = self.list[0]
        del(self.list[0])
        return last_item

    def is_empty(self):
        return len(self.list) == 0

Conclusion

Summarising, DFS and BFS are both exploring algorithms that will help you to research a graph. DFS will follow a single path until it gets stuck and then go in a different direction. BFS explores the graph by layers. We have covered how to implement DFS in python. I hope you enjoyed the article, and thanks for reading and supporting this blog!

What’s next?

Project-Based Programming Introduction

Steady pace book with lots of worked examples. Starting with the basics, and moving to projects, data visualisation, and web applications

100% Recommended book for Java Beginners

Unique lay-out and teaching programming style helping new concepts stick in your memory

90 Specific Ways to Write Better Python

Great guide for those who want to improve their skills when writing python code. Easy to understand. Many practical examples

Grow Your Java skills as a developer

Perfect Boook for anyone who has an alright knowledge of Java and wants to take it to the next level.

Write Code as a Professional Developer

Excellent read for anyone who already know how to program and want to learn Best Practices

Every Developer should read this

Perfect book for anyone transitioning into the mid/mid-senior developer level

Great preparation for interviews

Great book and probably the best way to practice for interview. Some really good information on how to perform an interview. Code Example in Java