Shortcut to seniority

Graph traversal

Graph traversal means that we visit every vertex and edge exactly once, in a well-defined order. The order in which the vertices are visited is important and may depend upon the algorithm or problem that you are solving.

During a traversal, it is important to track which vertices have been visited.

Breadth-first search (BFS)

BFS is starting the transversal from a given vertex, and traverse the graph on each layer.
This is performed in two steps:

Move horizontally and visit all the vertices of the current layer
Move to the next layer

image with caption — Figure 31 Algorithms - Graphs - BFS

If the starting vertex is 0, we will cover the graph as follows: 0 1 2 3 4 5 6 7

This is similar to level-order traversal of a tree, except that in, unlike trees, graphs can contain cycles, so we may traverse the same node again. The solution will be to use a boolean visited array whose index will be marked visited as we traverse through the vertices.

Implementation details

So, we have a graph, and we want to iterate through all the vertices in that graph. Starting from a vertex s, we will start by looking at all the vertices in the graph that contain a path from s => all the closest / nearest neighbourns.

To remember that, using BFS, we find all the vertices that are at a distance x from s, before finding all the vertices that are at a distance x+1 from s.

The visited array is marked 0 for all the nodes [0-7] and the queue is empty. We start with vertex 0, so we mark it as true in the visited array, and then proceed to add it into the queue.

Queue : 0

Index	0	1	2	3	4	5	6
Visited	1	0	0	0	0	0	0

Since the queue is not empty, we dequeue the first element, and look at the adjacent vertexes of that node. The vertexes that are not visited will be set as visited and added into the queue.

So, what will happen?

Node 1 has the following adjacent vertices: 0, 2, 3, and 5. 0 and 2 are already visited, so we do nothing about them, whereas 4 and 5 will be marked as visited and added into the queue. Since the queue is not empty, they will be added at the end of the queue, after 2 and 3, which are still on layer 2, meaning that they will be processed after we finish with that layer.

Code


                                            class Graph {      
                                            public:      
                                            void addEdge(int from, int to) {      
                                                vertices[from].push_back(to);      
                                            }      
                                            
                                            void BFS(int initial);    
                                            private:  
                                            map<int, vector<int>> vertices;    
                                            };      
                                                
                                            void Graph::BFS(int initial) {      
                                            // The queue we need to use      
                                            list<int> queue;      
                                                
                                            map<int, bool> visited_values;  

                                            // Add starting vertex in the queue and set to visited      
                                            visited_values[initial] = true;      
                                            queue.push_back(initial);      
                                                
                                            // Iterate through the queue items      
                                            while (!queue.empty()) {      
                                                const int top = queue.front();      
                                                cout << top << " ";      
                                                queue.pop_front();      
                                                
                                                // Insert at the end of the queue all the items that are adjacent       
                                                // and not visited already      
                                                for (int item : vertices[top]) {      
                                                    if (visited_values.find(item) == visited_values.end()) {      
                                                        visited_values[item] = true;      
                                                        queue.push_back(item);      
                                                    }      
                                                }      
                                            }      
                                            }      
                                                
                                            int main() {      
                                            Graph g;      
                                            g.addEdge(0, 1);      
                                            g.addEdge(0, 2);      
                                            g.addEdge(1, 2);      
                                            g.addEdge(2, 0);      
                                            g.addEdge(2, 3);      
                                            g.addEdge(3, 3);      
                                                
                                            g.BFS(2);      
                                            return 0;      
                                            }

Discussion on code

The main() function is initially called, which constructs a Graph and calls BFS function with 2 as parameter. The value 2 implies that we should start our BFS implementation with 2 as the starting point.

The algorithm starts at line 12.

Line 14 introduces a queue of integers, in which we will insert the values we want to further visit.
Line 16 introduces a map called visited_values, which keeps track of the nodes that were already visited.

Initially, we add our initial node (2) in the queue, and mark it as visited (Line 19-20).

Until the queue is not empty (Line 23):

We pick the first element from the queue (Line 24), and we remove it from the queue (Line 26).
For each edge connected to that node (Line 30):
If the node was not already visited, we set it as visited, and add it into the queue for further processing.

Because of this logic, what will happen?

We start with node 2, and mark it as visited. We iterate through all neighbours of 2, and add them into the queue, while also marking them as visited. We finish with node 2, and we get the next item from the queue, which is one of the vertices directly connected to 2, and we insert all of its neighbours into the queue as well.

To be noted that we don’t introduce into the queue any nodes that were already visited – we do this in order to NOT process the same node multiple times, causing an infinite loop (e.g. Add 2 in queue, which adds 5, which adds 2, which adds 5…).

Depth-first Search (DFS)

DFS is a recursive algorithm, that uses backtracking internally. It involves searching all the nodes by going ahead, and when there are no more nodes to visit, it moves backwards on the same path to find more nodes to traverse.

All the nodes on the current path will be visited before it goes on the next path. The implementation can be done by using stacks.

Steps:

Pick a starting node.
Push all its adjacent nodes in the stack.
Pop a node from stack to select the next node to visit.
Push all of its adjacent nodes into the stack.
Repeat until stack is empty.

You need to ensure that the nodes that are visited were marked, otherwise you will end up in an infinite loop.

Implementation details

So, we have a graph, and we want to iterate through all the vertices in that graph.

Starting from a vertex S, we add all its adjacent node that are not visited yet, into the stack.

Then, we pop a node from the stack and we add the adjacent node of that node into the stack. This way, we will visit all the adjancent nodes from a given starting node, until there are no nodes left unvisited. When that happens, we pop the stack and continue to look for unvisited nodes.

We can deduce that there are 3 states:

Unvisited: Node was not reached yet.
Pushed: Node was pushed into the stack.
Visited: The adjacent nodes are into the stack, while this node was popped.

We repeat these steps until the stack is empty.

If there’s a node that was added twice (let’s say node 3 here, first added by initial node 0, 2^nd time added after iterating through adjacent nodes of 4) – we add it into the stack again.

When we pop the nodes to visit them, we check their state – if they were already visited, we skip them.

Step-by-step Algorithm

First of all, we need to set the node we start from - let's go with node 0.

Node 0 is pushed into the stack

Stack: 0

After visiting node 0:

We pop value 0 from the stack
we push values 8, 3, 1 into the stack
Next, we visit the top node in the stack (1)

Stack: 8 3 1

Next, we visit one of the unvisited neighbours of node 1

We pick node 7 to further work with.

After visiting node 1:

We pop value 1 from the stack
We push value 7 into the stack
Next, we visit the top node in the stack (7)

Stack: 8 3 7

Code


                                            class Graph {       
                                            map<int, vector<int>> vertices;      
                                            public:      
                                            void addEdge(int from, int to) {      
                                                vertices[from].push_back(to);      
                                            }      
                                                
                                            void DFS(int initial);      
                                            };       
                                                    
                                            void Graph::DFS(int initial) {      
                                            // The stack we need to use      
                                            stack<int> myStack;      
                                                
                                            map<int, bool> visited_values;      
                                                
                                            myStack.push(initial);      
                                                
                                            while (!myStack.empty()) {      
                                                const int top = myStack.top(); 
                                                // We pop the item before we add others into the stack
                                                myStack.pop();      

                                                if (!visited_values[top]) {      
                                                    std::cout << top << " ";      
                                                    visited_values[top] = true;      
                                                
                                                    for (int item : vertices[top]) {      
                                                        myStack.push(item);      
                                                    }      
                                                }      
                                            }      
                                            }      
                                                    
                                            int main() {       
                                            Graph g;      
                                            g.addEdge(0, 1);      
                                            g.addEdge(0, 2);      
                                            g.addEdge(0, 3);      
                                            g.addEdge(1, 3);      
                                            g.addEdge(2, 3);      
                                            g.addEdge(2, 4);      
                                                
                                                g.DFS(0);      
                                                return 0;       
                                            }

Discussion on code

The main() function is initially called, which constructs a Graph and calls DFS function with 0 as parameter. The value 0 implies that we should start our DFS implementation with 0 as the starting point.

The algorithm starts at line 11.

Line 13 introduces a stack of integers, which we will further use to insert nodes we want to further visit. Line 15 introduces a map called visited_values, which we will use to keep track of nodes that were already visited.

Initially, we push the initial value into the stack (Line 17).

Until the stack is empty (Line 19):

We retrieve and remove the first node from the stack (Line 20, 22).
If the node was not already visited:
We iterate through it’s neighbours and push them further into the stack.

Because of this logic, what will happen?

We initially add the node 0 into the stack. Because the stack is not empty, we pop the node from the stack. It is not visited, so we mark it as such and we add its direct nodes into the stack. Assuming 0 has 3 neighbours, the stack will now have 3 items. Then, we start again, popping the first item from the stack, marking it as visited, and pushing its direct nodes into the stack.

Assuming this node has 3 neighbours, the stack will then have 5 items (the current node was already popped).

Because stacks follow the LIFO principle, we will iterate through all the direct nodes of our starting point, and only afterwards we will go through the other nodes within the stack.

Cycle detection – Union Find

We will talk about Union-find algorithm, which is an algorithm that detects whether our graph contains cycles.

Union-Find algorithm can be used only if the graph does not contain any self-loops. Otherwise, the algorithm will not work.

A cycle occurs in this case, because by selecting any node as starting point, we can get back to that node after visiting some nodes.

Implementation details

We will use a data structure to keep track of elements that are split in non-overlapping subsets.

The Find function shall return which subset an element is in, in order to compare with the subset of another element. If the function returns the same value for different elements, it means that they are part of the same subset.

Union function is used to merge multiple subsets into a single one.

Step-by-Step Algorithm

The subsets will be tracked in an array, usually called parent. Initially, the parent array is initialized with -1, which means that there’s only one item in each subset. When we call find, we chase the parent until we reach -1. When -1 is found, we return the value of that node.

Example 1: Node 3 has the parent node -1 => We return 3
Example 2: Node 0 has the parent node 2 => Node 2 has the parent -1 => we return 2

Node	Parent
0	-1
1	-1
2	-1

Now we process all edges one by one.

Let’s start with edge 0-1.

Find

Check subsets:
Subset[0] = -1 => we return 0
Subset[1] = -1 => we return 1

They are in different subsets, so we merge them.

Merge

For merging the subsets, we make one node a parent of another. Either way can be done, so here are the possible values.

Node	Parent
0	1
1	-1
2	-1

Node	Parent
0	-1
1	0
2	-1

Another visualization to understand this concept is as follows:

We start with the following sets:

{0}, {1}, {2}

After this step, the sets will be {0,1}, {2}

Let’s go with the first table (node 1 as parent of 0) and settle with this configuration from now on =>
Given edge (x,y) - merging will set y as parent of x.

Let’s continue with edge 1-2.

Find

Check subsets:
Subset[1] = parent[0] = -1 => returns 0
Subset[2] = -1 => returns 2

They are in different subsets, so we merge them.

Merge

Before merging, the table looks like this

Node	Parent
0	1
1	-1
2	-1

Now, we want to merge the subset.
Since we said that we use the following formula
(Given edge (x,y) – merging will set the parent of x to y)
=> Parent[1] = 2

Node	Parent
0	1
1	2
2	-1

After this step, the sets will be {0,1,2}, {}

Let’s continue with edge 0-2.
Let’s discuss one more time how we go from subset[0] to 2:

In the given table above - We go forward the line:

Node 0 has parent 1 => we check the node 1 in the table
Node 1 has parent 2 => we check the node 2 in the table
Node 2 has parent -1 => we return 2.

Node	Parent
0	1
1	2
2	-1

Find

Check subsets:
Subset[0] = parent[1] = parent[2] = -1 => returns 2
Subset[2] = -1 => returns 2

Find() shall return value 2, for both subsets.
This means that this edge forms a cycle.

Code


                                        int Find(int parent[], int idx) {    
                                        if (parent[idx] == -1)    
                                            return idx;    
                                        return Find(parent, parent[idx]);    
                                        }    
                                            
                                        void Union(int parent[], int x, int y) {    
                                        int set_x = Find(parent, x);    
                                        int set_y = Find(parent, y);    
                                        parent[set_x] = set_y;    
                                        }

Discussion on code

Let’s talk about the Find() function (Line 1) - It receives the array and an index as parameter.

If the value of the array at that index is -1, we return that index. If not, it means that there’s a parent for that node, and we recursively call the function with the value of the parent, to find the parent of that subset.

Let’s talk about the Union() function (Line 7) - It receives the array and two values as parameters.

We call the Find function for both parameters, to find the subset they belong, and check whether they are part of the same subset (Line 8-9). Line 10 is merging the subsets by setting one to be the parent of the other one.

Shortcut to seniority

Badea Robert