Shortcut to seniority

Home
Go to main page

Section level: Junior
A journey into the programming realm

Section level: Intermediate
The point of no return

Section level: Senior
Leaping into the unknown
Go to main page
A journey into the programming realm
The point of no return
Leaping into the unknown
Chapter warning
This chapter will contain some algorithms that are usually learned in school.
It is great if you're able to follow the problem and the solution, but if not, do not discourage yourself!
Algorithms are great for exposing you to different problem-solving techniques. However, you will not use them in your everyday programming, and if you actually do need one, you can always look them up on the internet.
Therefore, it's a good idea to read this chapter, but if you are stuck or simply get bored, feel free to go directly to Chapter 3 and come back later on.
Graph traversal means that we visit every vertex and edge exactly once, in a well-defined order. The order in which the vertices are visited is important and may depend upon the algorithm or problem that you are solving.
During a traversal, it is important to track which vertices have been visited.
BFS is starting the transversal from a given vertex, and traverse the graph on each layer.
This is performed in two steps:
If the starting vertex is 0, we will cover the graph as follows: 0 1 2 3 4 5 6 7
This is similar to level-order traversal of a tree, except that in, unlike trees, graphs can contain cycles, so we may traverse the same node again. The solution will be to use a boolean visited array whose index will be marked visited as we traverse through the vertices.
So, we have a graph, and we want to iterate through all the vertices in that graph. Starting from a vertex s, we will start by looking at all the vertices in the graph that contain a path from s => all the closest / nearest neighbourns.
To remember that, using BFS, we find all the vertices that are at a distance x from s, before finding all the vertices that are at a distance x+1 from s.
The visited array is marked 0 for all the nodes [0-7] and the queue is empty. We start with vertex 0, so we mark it as true in the visited array, and then proceed to add it into the queue.
Queue : 0
Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
Visited | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
Since the queue is not empty, we dequeue the first element, and look at the adjacent vertexes of that node. The vertexes that are not visited will be set as visited and added into the queue.
So, what will happen?
Node 1 has the following adjacent vertices: 0, 2, 3, and 5. 0 and 2 are already visited, so we do nothing about them, whereas 4 and 5 will be marked as visited and added into the queue. Since the queue is not empty, they will be added at the end of the queue, after 2 and 3, which are still on layer 2, meaning that they will be processed after we finish with that layer.
class Graph {
public:
void addEdge(int from, int to) {
vertices[from].push_back(to);
}
void BFS(int initial);
private:
map<int, vector<int>> vertices;
};
void Graph::BFS(int initial) {
// The queue we need to use
list<int> queue;
map<int, bool> visited_values;
// Add starting vertex in the queue and set to visited
visited_values[initial] = true;
queue.push_back(initial);
// Iterate through the queue items
while (!queue.empty()) {
const int top = queue.front();
cout << top << " ";
queue.pop_front();
// Insert at the end of the queue all the items that are adjacent
// and not visited already
for (int item : vertices[top]) {
if (visited_values.find(item) == visited_values.end()) {
visited_values[item] = true;
queue.push_back(item);
}
}
}
}
int main() {
Graph g;
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
g.BFS(2);
return 0;
}
The main() function is initially called, which constructs a Graph and calls BFS function with 2 as parameter. The value 2 implies that we should start our BFS implementation with 2 as the starting point.
The algorithm starts at line 12.
Line 14 introduces a queue of integers, in which we will insert the values we want to further visit.
Line 16 introduces a map called visited_values, which keeps track of the nodes that were already visited.
Initially, we add our initial node (2) in the queue, and mark it as visited (Line 19-20).
Until the queue is not empty (Line 23):
Because of this logic, what will happen?
We start with node 2, and mark it as visited. We iterate through all neighbours of 2, and add them into the queue, while also marking them as visited. We finish with node 2, and we get the next item from the queue, which is one of the vertices directly connected to 2, and we insert all of its neighbours into the queue as well.
To be noted that we don’t introduce into the queue any nodes that were already visited – we do this in order to NOT process the same node multiple times, causing an infinite loop (e.g. Add 2 in queue, which adds 5, which adds 2, which adds 5…).
DFS is a recursive algorithm, that uses backtracking internally. It involves searching all the nodes by going ahead, and when there are no more nodes to visit, it moves backwards on the same path to find more nodes to traverse.
All the nodes on the current path will be visited before it goes on the next path. The implementation can be done by using stacks.
You need to ensure that the nodes that are visited were marked, otherwise you will end up in an infinite loop.
So, we have a graph, and we want to iterate through all the vertices in that graph.
Starting from a vertex S, we add all its adjacent node that are not visited yet, into the stack.
Then, we pop a node from the stack and we add the adjacent node of that node into the stack. This way, we will visit all the adjancent nodes from a given starting node, until there are no nodes left unvisited. When that happens, we pop the stack and continue to look for unvisited nodes.
We can deduce that there are 3 states:
We repeat these steps until the stack is empty.
If there’s a node that was added twice (let’s say node 3 here, first added by initial node 0, 2nd time added after iterating through adjacent nodes of 4) – we add it into the stack again.
When we pop the nodes to visit them, we check their state – if they were already visited, we skip them.
First of all, we need to set the node we start from - let's go with node 0.
Stack: 0
After visiting node 0:
Stack: 8 3 1
Next, we visit one of the unvisited neighbours of node 1
We pick node 7 to further work with.
After visiting node 1:
Stack: 8 3 7
class Graph {
map<int, vector<int>> vertices;
public:
void addEdge(int from, int to) {
vertices[from].push_back(to);
}
void DFS(int initial);
};
void Graph::DFS(int initial) {
// The stack we need to use
stack<int> myStack;
map<int, bool> visited_values;
myStack.push(initial);
while (!myStack.empty()) {
const int top = myStack.top();
// We pop the item before we add others into the stack
myStack.pop();
if (!visited_values[top]) {
std::cout << top << " ";
visited_values[top] = true;
for (int item : vertices[top]) {
myStack.push(item);
}
}
}
}
int main() {
Graph g;
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(0, 3);
g.addEdge(1, 3);
g.addEdge(2, 3);
g.addEdge(2, 4);
g.DFS(0);
return 0;
}
The main() function is initially called, which constructs a Graph and calls DFS function with 0 as parameter. The value 0 implies that we should start our DFS implementation with 0 as the starting point.
The algorithm starts at line 11.
Line 13 introduces a stack of integers, which we will further use to insert nodes we want to further visit. Line 15 introduces a map called visited_values, which we will use to keep track of nodes that were already visited.
Initially, we push the initial value into the stack (Line 17).
Until the stack is empty (Line 19):
Because of this logic, what will happen?
We initially add the node 0 into the stack. Because the stack is not empty, we pop the node from the stack. It is not visited, so we mark it as such and we add its direct nodes into the stack. Assuming 0 has 3 neighbours, the stack will now have 3 items. Then, we start again, popping the first item from the stack, marking it as visited, and pushing its direct nodes into the stack.
Assuming this node has 3 neighbours, the stack will then have 5 items (the current node was already popped).
Because stacks follow the LIFO principle, we will iterate through all the direct nodes of our starting point, and only afterwards we will go through the other nodes within the stack.
We will talk about Union-find algorithm, which is an algorithm that detects whether our graph contains cycles.
Union-Find algorithm can be used only if the graph does not contain any self-loops. Otherwise, the algorithm will not work.
A cycle occurs in this case, because by selecting any node as starting point, we can get back to that node after visiting some nodes.
We will use a data structure to keep track of elements that are split in non-overlapping subsets.
The Find function shall return which subset an element is in, in order to compare with the subset of another element. If the function returns the same value for different elements, it means that they are part of the same subset.
Union function is used to merge multiple subsets into a single one.
The subsets will be tracked in an array, usually called parent. Initially, the parent array is initialized with -1, which means that there’s only one item in each subset. When we call find, we chase the parent until we reach -1. When -1 is found, we return the value of that node.
Example 1: Node 3 has the parent node -1 => We return 3
Example 2: Node 0 has the parent node 2 => Node 2 has the parent -1 => we return 2
Node | Parent |
---|---|
0 | -1 |
1 | -1 |
2 | -1 |
Now we process all edges one by one.
Let’s start with edge 0-1.
Find
Check subsets:
Subset[0] = -1 => we return 0
Subset[1] = -1 => we return 1
They are in different subsets, so we merge them.
Merge
For merging the subsets, we make one node a parent of another. Either way can be done, so here are the possible values.
Node | Parent |
---|---|
0 | 1 |
1 | -1 |
2 | -1 |
Node | Parent |
---|---|
0 | -1 |
1 | 0 |
2 | -1 |
Another visualization to understand this concept is as follows:
We start with the following sets:
{0}, {1}, {2}
After this step, the sets will be {0,1}, {2}
Let’s go with the first table (node 1 as parent of 0) and settle with this configuration from now on =>
Given edge (x,y) - merging will set y as parent of x.
Let’s continue with edge 1-2.
Find
Check subsets:
Subset[1] = parent[0] = -1 => returns 0
Subset[2] = -1 => returns 2
They are in different subsets, so we merge them.
Merge
Before merging, the table looks like this
Node | Parent |
---|---|
0 | 1 |
1 | -1 |
2 | -1 |
Now, we want to merge the subset.
Since we said that we use the following formula
(Given edge (x,y) – merging will set the parent of x to y)
=> Parent[1] = 2
Node | Parent |
---|---|
0 | 1 |
1 | 2 |
2 | -1 |
After this step, the sets will be {0,1,2}, {}
Let’s continue with edge 0-2.
Let’s discuss one more time how we go from subset[0] to 2:
In the given table above - We go forward the line:
Node | Parent |
---|---|
0 | 1 |
1 | 2 |
2 | -1 |
Find
Check subsets:
Subset[0] = parent[1] = parent[2] = -1 => returns 2
Subset[2] = -1 => returns 2
Find() shall return value 2, for both subsets.
This means that this edge forms a cycle.
int Find(int parent[], int idx) {
if (parent[idx] == -1)
return idx;
return Find(parent, parent[idx]);
}
void Union(int parent[], int x, int y) {
int set_x = Find(parent, x);
int set_y = Find(parent, y);
parent[set_x] = set_y;
}
Let’s talk about the Find() function (Line 1) - It receives the array and an index as parameter.
If the value of the array at that index is -1, we return that index. If not, it means that there’s a parent for that node, and we recursively call the function with the value of the parent, to find the parent of that subset.
Let’s talk about the Union() function (Line 7) - It receives the array and two values as parameters.
We call the Find function for both parameters, to find the subset they belong, and check whether they are part of the same subset (Line 8-9). Line 10 is merging the subsets by setting one to be the parent of the other one.
Vertex: Node containing data
Edge: The line that connects two vertices
Cycle: Cycles occur when nodes are connected in a way that they forms a closed structure.
Cycle: Cycles occur when nodes are connected in a way that they forms a closed structure.
Self loop: An edge that connects a vertex to itself (source = destination).
Divide et impera (Divide and conquer): Break the problem into sub-problems and solve them each recursively.
Negative weight edge refer to edges with a negative cost.