CScD-327 Programming Assignment — Spring 2004
Iterative Merge Sort Optimization
Due:  2004 April 23

This is available as a Word file (RTF format) that prints in two pages.

The iterative merge sort for arrays covered in lecture can be applied to linked lists with an O(n) space overhead.  It uses a work array of references to list head nodes.  In addition, it contains one extra null reference, so that when the number of lists to merge is odd, the last merge is of an existing list with an empty list.

  1. Traverse the list, breaking it apart into n lists of one entry each.
       // Note:  n must be the correct size of the list
          for ( k = 0; k < n; k++ )
          {
             work[k] = head;
             head = head.next;
             work[k].next = null;
          }
          work[k] = null;       // Extra null reference for odd cases
      
  2. Repeatedly go through the array, merging pairs of lists until only one list remains.  In cases in which there is an odd number of lists, the last list is merged with an empty list.  To handle this correctly, the new logical size of the array must be based on truncation upward on division of the old logical size by two.  In addition, the cell following the new array is set to null in case the new logical size is odd.  (Presumably the overhead to test for an odd number is comparable with the overhead to simply store the null reference into an array cell.)
          for ( lim = size ; lim > 1; lim = (lim+1)/2 )
          {
             for ( j = k = 0; k < lim; j++, k += 2 )
                work[j] = ListNode.merge ( work[k], work[k+1] );
             work[j] = null;     // Extra null reference for odd cases
          }
      

You are being given the working code for this.  Your task is to develop an improvement called the "Natural Merge."  The list as received may have segments already in proper order.  Take advantage of that!

In the "mergeNatural" method, the generation of the array of list heads is to traverse the existing list and break that list into its ordered segments.  That is, if child = parent.next, start a new list (and close out the old list) when parent.compareTo(child) > 0.  Thus, if the list coming in is already sorted, only one list is generated and we have an W(n) algorithm.  The worse case (a reversed list) simply reverts to the one-node list case, with the added overhead of confirming that.  The hope is for the random case the reduction in the initial number of lists will justify the overhead of generating those lists.  In point of fact, we eliminate one outer-loop pass. *

Here is a list of the available files.  It includes the test-bed main I used to develop the class ListNode and the class List — in case you would find it useful in developing your own expansion of the class List.

Files Available

 List.java  Java program to be updated with the mergeNatural method
 ListNode.java  Class for the nodes used by the class List
 MergeSort.java  Iterative array merge sort included in BenchMerge
 BenchMerge.java  Main program to exercise the class List for hand-in
 TestList.java  Test-bed main used in developing List and ListNode
 BenchMerge.jar  Executable jar file with the instructor's implementation

Implement the mergeNatural method, then use the BenchMerge.main to exercise your implementation and compare it with the other Merge Sort implementations.  In the initial dialog of BenchMerge.main, the program requests four quantities:

Initial array length:  2
Maximum array length:  100
Increment for length:  1
Number of passes:  100000

(These can also be provided as command-line arguments.)  Be sure that you choose a large enough "Number of passes" so that the time required for each array length is more than 0.1 seconds — depending on the speed of the computer on which you are doing your work, this may be a smaller number that I have in the above example.

Prepare a write-up of about one page (either as pure text or as a word-processor file) describing the results that you find.  Note that BenchMerge creates a CSV file that you can open directly in Excel or your favorite spreadsheet.  This means that you can easily generate a graph showing your results to bring into your write-up (if it is a word-processor file).

Prepare a jar file containing (1) your amplified "List.java", and (2) the write-up.  The ListNode.java and the MergeSort.java files used to check out your program will be the ones given above.


* Proof provided by Dr. Ray Hamel:  There are on average (n–1)/2 sequence breaks in a sequence of n randomly selected distinct items.  Each sequence S of n items [where S = (S1, . . . , Sn)] has its reverse sequence R such that Rj = Sn-j+1.  Both sequences together have n–1 sequence breaks:  for each adjacent pair of (distinct) values, the two values are out of order either in R or in S (but not in both), and there are n–1 adjacent pairs.  For randomly selected sequences, both sets are equally likely, so that there are (n–1)/2 sequence breaks on average.


Last Updated:   2004-Apr-06 10:04