Kruskals algorithm in Processing

Minimum Spanning Tree (red path)
I was reading Robert Sedgewicks “Algorithms in Java” and had skipped to part 5 concerning Graphs.
I know I ought not skip portions of a book concerning something that difficult, but 130 pages into the first part I needed to see where all this theory was going.
I stumbled over the Kruskal/Prims/Dijkstra algorithms and the visual representations caught my eye. They were really simple to look at but you could sense there was some sort of genius system behind it. That is one of my favorite reasons for opening up Processing and start sketching.
A few need to know things on Graphs:
Thinking in 2 dimensions, a Graph consists of a number of points on a surface. A point is called a vertex and each vertex can be connected to a number of other vertices. Such a connection between 2, or more, vertexes is called and edge.

simple graph
There are many different types of Graphs: directed graphs, where directed means, that two vertices that are connected also states which direction you can travel from one vertex to the other, e.g. V1 ->V2 means you can go from V1 to V2 but not the other way. Undirected Graphs is then, of course, a graph where it does not matter which way you travel. Directed and undirected graphs can also be weighted or not. This means that “it takes a certain amount of energy” to travel from one vertex to another.

simple directed weighted graph
This comes clear when we think back to the 2 dimensions and that a graph can be points/vertices on a surface. If we were looking to connect three vertices using the least amount of “energy” and vertex v1 is situated a distance of 9 meters from vertex V2 and vertex V1 is situated 4 meters from vertex V3 and vertex V2 is situated 3 meters from vertex V3, we would connect them as such: V1-V3-V2

Euclidian weighted graph
Minimum Spanning Tree
Having scratched the surface of Graphs it is now a bit easier to understand what a minimum spanning tree is.
In the “Algorithms in Java” book the minimum spanning tree(MST) is defined as:
In a weighted graph, find a minimum-weight set of edges that connects all the vertices.
It is also defined as:
A minimum spanning tree of a weighted graph is a spanning tree whose weight(the sum of weights of it’s edges) is no larger than the weight of any other spanning tree.
It is good to dwell on that last definition for a second. It says that there might be lots and lots of spanning trees in a graph, and that some of them can have the same combined weight. According to the definition this is okay, the more the merrier, as long as it is still a minimum spanning tree.
This means, that from any vertex you should be able to travel to any other vertex, not necessarily along the shortest path, that is another algorithm altogether, but all the vertices is connected in a way that adds up to the combined shortest edges.
There are quite a few algorithms that strives to solve this problem, two of them is Prim’s and Kruskal’s algorithm.
Prim’s algorithm, it goes under many different names as it has been discovered and implemented by more than one person, but that is a story for Wikipedia and not this very short post.
Prim’s algorithm looks at a graph, selects an arbitrary vertex and starts testing all the edges connected to this vertex, once it finds the shortest edge, the edge with the smallest weight, it adds this to the tree, e.g. V12-V65. It then goes on to look for the next shortest edge from either V12 or V65. Before the algorithm can add another edge it must test that it is not creating a cycle. A cycle is if V1-V2 and V2-V3 and V3-V1 is connected, meaning you could start out from V1 and then end up in V1 again. A cycle is redundant in a minimum spanning tree and not desired.
If the algorithm consists of N vertices the MST will be finished when N-1 edges has been added to the MST, regarding no cycles has been made.
The Kruskla’s algorithm works in a very similar way, the main difference is that Kruskal’s algorithm starts at the first vertex, i.e V1 and then works it’s way outwards from that.
This means that an identical graph will yield the same MST each time when using Kruskal and that Prim’s can yield different MST’s depending on which vertex was set as the starting point(remember the second definition…)
Here is the Processing code for the sketch shown at the beginning
There are 101 vertices, this adds up to 5050 possible edges, they are drawn in using a transparant white. The MST will be the red path as in the picture in the start of the post.
14/10/2009 at 20:35 Permalink
I tried to compile and run the code in Processing 1.0.7 on OSX 10.5 and I get the error: “the nest type GraphBuilder cannot hide an enclosing type”
14/10/2009 at 22:45 Permalink
Hi Erik
I did this almost a year ago in a pre 1.0 version of processing so Im a bit uncertain on the details:)
But I guess you copy pasted the code from the example and maybe somehow a { or } or a class/tab went wrong.
I could easily be something along the lines of this : Processing forum thread
This was done, simply to understand the algorithm, so it’s not polished in any way.
If you get it up and running there is lots of room for improvement, feel free to get back with Your changes and I’ll post them.
Ricki