Generation of a Minimum Spanning Tree
Without Traversing the Graph

Kruskal’s Algorithm

This material is predominantly taken from Sartaj Sahni, Data Structures, Algorithms, and Applications in C++ (WCB McGraw-Hill, 1998), pp. 646-52.  [Data Structures, Algorithms, and Applications in Java (WCB McGraw-Hill, 2005), pp.  726-31]  Kruskal’s algorithm is based on adding edges to a forest of spanning trees in a way that does not generate cycles.  It is based on maintaining sets of trees: each vertex is initially in its own set of size one. All the edges are processed in non-decreasing order — so this is an example of a “Greedy” algorithm.  Edges that do not generate a cycle (that is, do not connect vertices in the same set) are added to the tree.  The two sets containing the two vertices are then joined into one set.

Major expense:  processing the edge list in non-decreasing order.  If there is a large number of possible weights, then comparison-based sorting of the edges will require O( ||Edges|| log ||Edges|| ) — where ||Edges|| represents the “cardinality” of the edges, which is just a fancy way of turning “how many edges there are” into symbolic form.

If there is a small number of possible weights, the edge sorting can be done in approximately O(||Edges||) time using a radix sort or even (if a linked list is used for the edge list) a bucket sort.  For the bucket sort, generate an array of list headers, one for each allowed edge weight.  Deal the edges into the “buckets” (that is, link them into their respective lists), and then chain the lists together into a single list.  Time required is O( ||Edges|| + ||Number of allowed values|| ).

For that matter, if the number of edges is significantly larger than the number of vertices, one might consider using a MinHeap — the construction phase is then O(||Edges||); the removal phase would have the expense of about ||Vertices|| log ||Edges||

Kruskal’s algorithm makes heavy use of set operations “find” and “union”.  Optimally implemented, the “find” and “union” operations on sets grow very slowly with the number of elements in the disjoint sets — while the time complexity does grow somewhat with increasing size, it is very slow and the algorithms are very close to O(1) operations.

Set Union/Find Implementation Discussion

Java implementation of Kruskal — within an “extends CGraph”.
Find implementation is the optimized one.
Union implementation available here.

public void Kruskal()
{  int row, col;

   ArrayList work = new ArrayList();
   PriorityQueue pq;
   Set set = new Set(m_Size);     // Embedded class below

   clearFlags();
   for ( row = 0; row < m_Size; row++ )
   // the undirected graph is stored in directed fashion, so
      for ( col = row+1; col < m_Size; col++ )
         if ( m_Road[row][col] != null )
            work.add(new WtEdge(new CPair(row, col),
                     m_Road[row][col].m_Distance));
   pq = new PriorityQueue(work);

   // For N vertices, N-1 edges
   while ( m_NextEdge < m_Size-1 && !pq.isEmpty() )
   {  WtEdge current = pq.remove();
      CPair  edge = current._item;
      int    src  = edge.m_First,
             dst  = edge.m_Second;
      int    set1 = set.find(src),
             set2 = set.find(dst);

      if ( set1 == set2 )
         continue;

      m_Sequence[m_NextEdge++] = edge;
      set.union(set1, set2);
   }
}

Processing of Edges in Kruskal’s Algorithm

Accept A ==> B, wt. 1
Accept L ==> M, wt. 1
Accept J ==> M, wt. 1
Accept E ==> G, wt. 1
Accept D ==> E, wt. 1
Accept J ==> K, wt. 1
Accept A ==> F, wt. 1
Accept D ==> F, wt. 1
Accept A ==> C, wt. 2
Reject E ==> F, wt. 3 --- cycle (E D F E)
Reject A ==> G, wt. 5 --- cycle (A F D E G A)
Reject J ==> L, wt. 5 --- cycle (J M L J)
Accept H ==> I, wt. 137
Disconnected components.
One includes vertex E
One includes vertex H
One includes vertex L

BACK


Worked examples:
A-G-Demo.doc   A-G of above example done in steps, showing set trees, one per page.
A-G-Demo.pdf    Same as a PDF — much easier to view from a browser.  Navigate by <Page Down/Up>, possibly <Left/Right Arrow>
WikiMST.html     Worked example from the Wikipedia page on Kruskal.   Picture of the graph    Text file for the graph
Traces.doc           Class hand-out:  Prim’s, Dijkstra’s, and Kruskal’s algorithms traced on Test0, Test5 and Test2 data sets
Traces.pdf            Formatted to print in two-column format on legal size paper — prints two-sided on two sheets.
Test2.doc             Alternative working of the Test2 data set, showing both Prim’s algorithm and Kruskal’s algorithm results pictorially.
Just the pictures
8_13_Example     Using a figure from Sahni’s web site for an example — eight vertices and thirteen edges


Java Source Files

Note:  the implementation of Kruskal() within GraphK.java includes some debugging code not shown in the hand-out:  output of each edge as it is accepted or rejected.

GraphK.java        public class GraphK extends class CGraph — adds Kruskal method   (Txt file)
CGraph.java
        Class CGraph — slightly modified due to changes in GrUtility.java   (Txt file)
GrafTravK.java    Driving main program — only does Prim and Kruskal   (Txt file)
Kruskal.jar           Executable jar file with the source code as well.

MyInpLib.java      Assorted I/O routines used by CGraph.java   (Txt file)

Kruskal.doc          Class hand-out with code example and specimen run
PAMS_1956.pdf  Kruskal's paper in the Proceedings of the American Mathematical Society, in a PDF wrapper.
http://www.jstor.org/stable/2033241   Source of the above:  clicking the PDF button in the top right of the screen from on campus