diff --git a/SUMMARY.md b/SUMMARY.md index f43910f52..d2c612b67 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -44,6 +44,8 @@ * [Computer Graphics](contents/computer_graphics/computer_graphics.md) * [Flood Fill](contents/flood_fill/flood_fill.md) * [Quantum Information](contents/quantum_information/quantum_information.md) +* [Graph Algorithms](contents/graph_algorithms/README.md) + * [Dijkstra's Algorithm](contents/dijkstra/dijkstra.md) * [Cryptography](contents/cryptography/cryptography.md) * [Computus](contents/computus/computus.md) -* [Approximate Counting Algorithm](contents/approximate_counting/approximate_counting.md) \ No newline at end of file +* [Approximate Counting Algorithm](contents/approximate_counting/approximate_counting.md) diff --git a/contents/dijkstra/assets/sample_graph.png b/contents/dijkstra/assets/sample_graph.png new file mode 100644 index 000000000..4f3ce5817 Binary files /dev/null and b/contents/dijkstra/assets/sample_graph.png differ diff --git a/contents/dijkstra/code/python/dijkstra.py b/contents/dijkstra/code/python/dijkstra.py new file mode 100644 index 000000000..a5e004c44 --- /dev/null +++ b/contents/dijkstra/code/python/dijkstra.py @@ -0,0 +1,69 @@ +import sys +from itertools import product + +class Graph: + def __init__(self, vertices, edges): + self.vertices = vertices + self.edges = {v: {} for v in self.vertices} + + for x, y, dist in edges: + self.edges[x][y] = dist + self.edges[y][x] = dist + + def route_from(self, start): + current = start + unvisited = {x for x in self.vertices if x != start} + distances = {k: 0 if k == start else sys.maxsize for k in self.vertices} + paths = {k: [] if k == start else None for k in self.vertices} + + while len(unvisited) > 0: + # Check all edges incident to our current vertex, looking for improved paths to neighbors + for n, d in self.edges[current].items(): + if distances[current] + d < distances[n]: + distances[n] = distances[current] + d + paths[n] = paths[current] + ([current] if current 
!= start else []) + + # Find the next vertex to iterate on. This should be the unvisited vertex for which we've found a path + # and whose known distance is the smallest among all the yet-unvisited vertices. + next_node = None + for n in unvisited: + if distances[n] < sys.maxsize and (next_node is None or distances[n] < distances[next_node]): + next_node = n + + if next_node is not None: + unvisited.remove(next_node) + current = next_node + else: + # This break handles the scenario in which our graph is disconnected, and we've found optimal paths + # to all vertices in the component we began our search in. + break + + # Clean up our output to better indicate which vertices are disconnected from our start vertex + for n, d in distances.items(): + if d == sys.maxsize: + distances[n] = None + paths[n] = None + + return distances, paths + + +def main(): + vertices = list(range(1, 24)) # 1-23 + edges = [ + (1, 2, 5), (2, 3, 7), (3, 4, 1), (3, 11, 5), (3, 12, 7), (4, 5, 4), (4, 11, 5), (5, 6, 11), (5, 14, 20), (6, 7, 7), (6, 8, 11), (6, 14, 11), + (8, 9, 4), (8, 15, 2), (9, 10, 5), (10, 11, 12), (10, 12, 17), (12, 13, 10), (12, 16, 2), (13, 14, 5), (14, 15, 10), (15, 16, 9), (17, 18, 1), + (17, 20, 6), (18, 19, 7), (18, 21, 10), (19, 22, 4), (20, 21, 5), (21, 22, 4) + ] + g = Graph(vertices, edges) + + for start in vertices: + distances, paths = g.route_from(start) + print('Starting at {}:'.format(start)) + for end in vertices: + if distances[end] is not None: + print('\t-> {} with distance {} via path {}'.format(end, distances[end], paths[end])) + else: + print('\t-> {} does not exist'.format(end)) + +if __name__ == '__main__': + main() diff --git a/contents/dijkstra/dijkstra.md b/contents/dijkstra/dijkstra.md new file mode 100644 index 000000000..aab86cdba --- /dev/null +++ b/contents/dijkstra/dijkstra.md @@ -0,0 +1,23 @@ +# Dijkstra's Algorithm +Edsger Dijkstra was one of the foremost computer scientists of the 20th century. 
He pioneered a large body of work that is integral to modern theoretical computer science and software engineering. A graph algorithm bearing his name is only one of his major contributions, but it is the one with which he is most often associated. The algorithm efficiently finds shortest paths between vertices in a weighted graph, on the condition that no edge weight is negative. Under that condition, shortest-path distances satisfy the [Triangle Inequality](https://en.wikipedia.org/wiki/Triangle_inequality): reaching a vertex by way of an intermediate vertex is never cheaper than the shortest path to it. + +The algorithm is fairly simple and uses this property to ignore potential paths that can't be shorter than ones already known. Two variations of the algorithm exist: one which finds the shortest path between two particular vertices `x` and `y`, terminating once one such path is found, and one which finds the shortest path between a start vertex `x` and all other vertices in the same connected component of the graph. These two variations are functionally the same, but the former is able to terminate once its narrower goal is met. We will focus on the latter variant, but the code for the former is very similar. + +To begin, start at the first vertex `x` and make a set of all the other "unvisited" vertices in the graph. Initialize a structure to store the shortest known path to all vertices, with a path of zero weight to the start vertex and infinity (or a suitably large constant) to all others. Next, look at all the neighbors of `x`. Following each edge incident to `x` yields a candidate path to each neighbor of `x`. These are the shortest paths known so far, although a shorter route through another vertex may still turn up for some of them. Record these paths (both the vertices crossed and the total path weight) in the auxiliary data structure(s). + +Now, step to the unvisited vertex `z` whose known path weight is the lowest. Because no edge weight is negative, no route through any other unvisited vertex can reach `z` more cheaply, so the path recorded for `z` is final. Remove that vertex from the unvisited set. 
Repeat the above process from this vertex, noting that a path to each of `z`'s neighbors that goes through `z` has a total weight equal to that of the `x`-`z` path plus the weight of the edge being traversed; record it only if it is shorter than the path already known for that neighbor. Continue this process of improving shortest known paths, choosing the unvisited vertex whose known path is the shortest, and checking its neighbors until one of two terminating conditions is met. Either the unvisited set of vertices is exhausted (meaning we have traversed the entire graph), or none of the remaining unvisited vertices has a known path to it. The latter condition is met when the graph is disconnected, leaving us with vertices that cannot be reached from the start vertex via any path. Once either of these conditions is met, we have found the shortest paths from `x` to all vertices connected to it. + +The example below uses the following graph as a sample. + +![alt text](assets/sample_graph.png "Example Dijkstra Graph") + +{% method %} +{% sample lang="py" %} +[import:1-69, lang:"python"](code/python/dijkstra.py) +{% endmethod %} + + + + diff --git a/contents/graph_algorithms/README.md b/contents/graph_algorithms/README.md new file mode 100644 index 000000000..917e14642 --- /dev/null +++ b/contents/graph_algorithms/README.md @@ -0,0 +1,4 @@ +# Graph Algorithms +Graph Theory is a beautiful intersection of pure, abstract math and pragmatic computer science. The idea of a graph is simple to describe, yet it has an incredibly deep well of applications, both theoretical and practical. A graph is simply a collection of objects called *vertices* or *nodes* and connections between them called *edges*. This structure can be augmented in a variety of ways: duplicate edges, loops on single vertices, directed edges, etc. These structures are incredibly important to computer science as a variety of real-world systems can be modelled as graphs: computer networks, physical utility connections, user interactions, and the list goes on. 
+ +Algorithms on these objects are thus very important if data is to be stored in such a way. Common tasks include traversing graphs, finding vertices that meet some condition, and optimizing various values. This chapter focuses on a selection of prominent graph algorithms and circumstances in which they are often applied.
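
As a companion to the walkthrough in `dijkstra.md`, here is a minimal sketch of the same single-source procedure that uses a binary heap (Python's standard `heapq` module) to pick the next vertex, rather than the linear scan over the unvisited set used in `dijkstra.py`. The names `dijkstra` and `build_adjacency` are illustrative and are not part of this change. The sketch assumes non-negative edge weights, an undirected graph given as `(x, y, weight)` triples, and vertex labels that can be compared with one another (needed when two heap entries tie on distance).

```python
import heapq


def build_adjacency(edges):
    """Build an undirected adjacency mapping from (x, y, weight) triples."""
    adjacency = {}
    for x, y, weight in edges:
        adjacency.setdefault(x, {})[y] = weight
        adjacency.setdefault(y, {})[x] = weight
    return adjacency


def dijkstra(adjacency, start):
    """Return (distances, predecessors) for every vertex reachable from start.

    Assumes every weight in `adjacency` is non-negative. Vertices that cannot
    be reached from `start` simply do not appear in the returned dicts.
    """
    distances = {start: 0}
    predecessors = {start: None}
    visited = set()
    queue = [(0, start)]  # entries are (best known distance, vertex)

    while queue:
        dist, vertex = heapq.heappop(queue)
        if vertex in visited:
            continue  # stale entry: this vertex was already finalized
        visited.add(vertex)

        # Relax every edge leaving the vertex we just finalized.
        for neighbor, weight in adjacency[vertex].items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                predecessors[neighbor] = vertex
                heapq.heappush(queue, (candidate, neighbor))

    return distances, predecessors


if __name__ == "__main__":
    # A few edges taken from the sample graph in dijkstra.py.
    sample = build_adjacency([(1, 2, 5), (2, 3, 7), (3, 4, 1), (3, 11, 5), (4, 11, 5)])
    distances, _ = dijkstra(sample, 1)
    print(distances)  # {1: 0, 2: 5, 3: 12, 4: 13, 11: 17}
```

Storing one predecessor per vertex instead of a full path list keeps each update cheap; a path can be rebuilt afterwards by following `predecessors` back from the target to the start. Stale heap entries are skipped rather than updated in place because `heapq` offers no decrease-key operation.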