List Processing: Sort Again, Naturally

Timothy J. Rolfe
Professor
Computer Science Department
Eastern Washington University
202 Computer Sciences Building
Cheney, Washington 99004-2412 USA
http://penguin.ewu.edu/~trolfe/
mailto:Timothy.Rolfe@mail.ewu.edu


Published in inroads (the ACM SIGCSE Bulletin), Vol. 37, No. 2 (June 2005), pp. 46-48.



Abstract

This paper discusses a possible student project for use within the Data Structures and Algorithms treatment of linked lists. Students can explicitly compare the recursive list-oriented MergeSort algorithm with iterative list-oriented MergeSort algorithms (with O(n) space overhead), including the "Natural MergeSort." The author’s experimental results are shown for implementations in C and in Java.

Introduction

When one is processing a linked list, Sedgewick notes that the recursive (top-down) MergeSort is easy to understand and implement. [1, p. 354] [2, p. 367] Immediately following his discussion of the recursive algorithm, he also discusses an iterative, bottom-up implementation involving queues.

For my bottom-up implementation in lecture, though, I find it useful to cover Sahni’s iterative implementation of array MergeSort. [3, pp. 680-81] The O(n log n) nature of the algorithm is blindingly obvious. As the sorted-segment size repeatedly doubles, we have an obvious O(log n) outer loop, while the inner loop, transferring all of the data from one array to the other, obviously requires O(n) operations. Indeed, we can make the stronger statement of a Θ(log n) outer loop and Θ(n) inner loop for an over-all Θ(n log n).
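To make the doubling-segment structure concrete, here is a minimal C sketch of the iterative array MergeSort described above. It is a self-contained illustration, not Sahni’s actual code; the function names and the swapping of the two arrays’ roles are assumptions of this sketch.

```c
#include <string.h>

/* Merge a[lo..mid-1] and a[mid..hi-1] into b[lo..hi-1]. */
static void merge_runs(const int *a, int *b, int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        b[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) b[k++] = a[i++];
    while (j < hi)  b[k++] = a[j++];
}

/* Bottom-up array MergeSort: the segment size doubles each pass
 * (Theta(log n) passes), and each pass transfers all n items
 * between the two arrays (Theta(n) work per pass). */
void mergesort_iter(int *a, int *work, int n)
{
    int seg, lo;
    int *src = a, *dst = work, *tmp;
    for (seg = 1; seg < n; seg *= 2) {        /* outer loop: log n passes */
        for (lo = 0; lo < n; lo += 2 * seg) { /* inner loop: n transfers  */
            int mid = lo + seg     < n ? lo + seg     : n;
            int hi  = lo + 2 * seg < n ? lo + 2 * seg : n;
            merge_runs(src, dst, lo, mid, hi);
        }
        tmp = src; src = dst; dst = tmp;      /* swap roles of the arrays */
    }
    if (src != a)                             /* result may be in work[]  */
        memcpy(a, src, (size_t)n * sizeof(int));
}
```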

This use of an array suggests an alternative to the bottom-up implementation with queues. While the top-down MergeSort is a stable sorting algorithm (preserving existing order in the final result), the queue-based bottom-up implementation is not stable: if the initial set of queues contains an odd number of queues, it is easy to see that the last queue of the initial set ends up merged to the left of (ahead of) lists that came from the front of the original list, and that this mixing can potentially continue throughout the successive dequeues, merges, and enqueues. [4]
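For reference, the queue-based scheme can be sketched in C as follows. This is a minimal illustration under stated assumptions: the Node type, the merge routine, and the circular-buffer queue are inventions of this sketch, not code from the article, and n must be the correct length of the list.

```c
#include <stdlib.h>

typedef struct Node { int data; struct Node *next; } Node, *NodePtr;

/* Merge two sorted lists, taking from the first list on ties. */
static NodePtr merge(NodePtr a, NodePtr b)
{
    Node dummy; NodePtr tail = &dummy;
    while (a != NULL && b != NULL) {
        if (a->data <= b->data) { tail->next = a; a = a->next; }
        else                    { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;
    return dummy.next;
}

/* Bottom-up MergeSort with a queue of lists: enqueue n one-node
 * lists, then repeatedly dequeue two, merge them, and enqueue the
 * result.  With an odd number of lists, the last list "wraps
 * around" and is merged ahead of lists from the front of the
 * original list -- the source of the instability discussed above. */
NodePtr queue_mergesort(NodePtr head, int n)
{
    NodePtr *q = malloc((size_t)n * sizeof *q);  /* circular queue */
    int front = 0, back = 0, count = 0;
    if (head == NULL) { free(q); return NULL; }
    while (head != NULL) {                       /* enqueue singletons */
        NodePtr next = head->next;
        head->next = NULL;
        q[back] = head; back = (back + 1) % n; count++;
        head = next;
    }
    while (count > 1) {                          /* dequeue 2, merge, enqueue */
        NodePtr a = q[front]; front = (front + 1) % n;
        NodePtr b = q[front]; front = (front + 1) % n;
        q[back] = merge(a, b); back = (back + 1) % n;
        count--;
    }
    head = q[front];
    free(q);
    return head;
}
```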

The implementation of the iterative MergeSort, to mimic the queue-based implementation, starts with lists of size one. After that, it takes only a small modification to implement the Natural MergeSort, which takes advantage of existing order in the list. [5, pp. 160-61] [1, pp. 355-56] [2, p. 368] In his discussion, Sedgewick offers the opinion that for random files the Natural MergeSort may even perform more poorly than the top-down implementation because of the O(n) check of existing order in setting up the initial set of lists. Students in their projects can experimentally test Sedgewick’s suggestion.

Iterative Implementation

The iterative merge sort for arrays can be applied to linked lists with an O(n) space overhead.  It uses a work array of pointers to lists (or of references in the Java environment).  In addition, it contains one extra null reference, so that when the number of lists to merge is odd, the last merge is of an existing list with an empty list. In this fashion, the sort becomes a stable sorting method because the segments in their merging are retained in their original order.

Here are the code segments reflecting the generation of the initial array of one-element lists and the subsequent merging of those lists until only one remains. It is shown as C code, but the Java incarnation only requires the replacement of "->next" with ".next". Further, in my Java implementation the merge method is a static method of the class ListNode, so that invocation is "ListNode.merge" rather than "merge."

  1. Traverse the list, breaking it apart into n lists of one entry each.
  2.    // Note:  n must be the correct size of the list
          for ( k = 0; k < n; k++ )
          {
             work[k] = head;
             head = head->next;
             work[k]->next = NULL;
          }
          work[k] = NULL;       // Extra null reference for odd cases
  3. Repeatedly go through the array, merging pairs of lists until only one list remains.  In a case in which there is an odd number of lists, the last list is merged with an empty list.  To handle this correctly, the new logical size of the array must be based on rounding upward on division of the old logical size by two.  In addition, the cell following the new logical end of the array is set to null in case the new logical size is odd.  (Presumably the overhead to test for an odd number is comparable with the overhead simply to store the null reference into an array cell.)
      for ( lim = n ; lim > 1 ; lim = (lim+1)/2 )
      {
         for ( j = k = 0; k < lim; j++, k += 2 )
            work[j] = merge ( work[k], work[k+1] );
         work[j] = NULL;     // Extra null reference for odd cases
      }
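The merge routine invoked above is not shown in the article’s excerpts. A minimal stable list merge might look like the following sketch (the Node type and the dummy-head technique are assumptions of this sketch; a null argument denotes an empty list, which is what lets the extra null reference in work[] handle the odd-count case):

```c
#include <stddef.h>

typedef struct Node { int data; struct Node *next; } Node, *NodePtr;

/* Stable merge of two sorted lists: on equal keys the node from
 * the first list is taken, preserving the original order.  A NULL
 * argument is treated as an empty list. */
NodePtr merge(NodePtr a, NodePtr b)
{
    Node dummy;                 /* dummy head simplifies the loop */
    NodePtr tail = &dummy;
    while (a != NULL && b != NULL) {
        if (a->data <= b->data) { tail->next = a; a = a->next; }
        else                    { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;   /* append the leftover list */
    return dummy.next;
}
```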

Natural MergeSort Implementation

In the Natural MergeSort algorithm, the generation of the array of list heads is performed by traversing the existing list and breaking that list into its ordered segments.  That is, if child = parent->next (in Java, child = parent.next), wherever parent->data > child->data (in the equivalent Java implementation, parent.compareTo(child) > 0) start a new list (and close out the old list).  Thus, if the list coming in is already sorted, only one list is generated and we have an Ω(n) algorithm.  The worst case (a reversed list) simply reverts to the one-node list case, with the added overhead of confirming that.  The hope is that for the random case the reduction in the initial number of lists will justify the overhead of generating those lists. One can prove that there are on average (n-1)/2 sequence breaks in a length-n data set. [6]

The first loop changes into one that traverses the list, breaking it apart into an initially unknown number of sorted lists. The worst case will be that we end up with one-element lists and lim == n. The best case, of course, is that there is only one list at the end.

   for ( lim = 0; head != NULL; lim++ )
   {
      NodePtr parent = head, // Find the control break
              child;

      work[lim] = parent;
      child = parent->next;
      
      while ( child != NULL && parent->data <= child->data )
      {
         parent = child;
         child = parent->next;
      }
      head = child;
      parent->next = NULL;
   }
   work[lim] = NULL;         // Extra null reference for odd cases
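Putting the run-collection loop together with the merging loop from the previous section, a complete Natural MergeSort might be assembled as below. This is a self-contained sketch, not the article’s distributed code: the Node type, the inlined merge routine, and the function name naturalMergeSort are assumptions, and n must be the correct length of the list (an upper bound on the number of runs).

```c
#include <stdlib.h>

typedef struct Node { int data; struct Node *next; } Node, *NodePtr;

/* Stable merge of two sorted lists; NULL denotes an empty list. */
static NodePtr merge(NodePtr a, NodePtr b)
{
    Node dummy; NodePtr tail = &dummy;
    while (a != NULL && b != NULL) {
        if (a->data <= b->data) { tail->next = a; a = a->next; }
        else                    { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;
    return dummy.next;
}

/* Natural MergeSort: break the list at each descent ("control
 * break") into sorted runs, then merge pairs of runs until only
 * one list remains. */
NodePtr naturalMergeSort(NodePtr head, int n)
{
    NodePtr *work = malloc(((size_t)n + 1) * sizeof *work);
    int lim, j, k;

    for (lim = 0; head != NULL; lim++) {       /* collect sorted runs */
        NodePtr parent = head, child;
        work[lim] = parent;
        child = parent->next;
        while (child != NULL && parent->data <= child->data) {
            parent = child;
            child = parent->next;
        }
        head = child;
        parent->next = NULL;
    }
    work[lim] = NULL;             /* extra null reference for odd cases */

    for ( ; lim > 1; lim = (lim + 1) / 2) {    /* merge pairs of runs  */
        for (j = k = 0; k < lim; j++, k += 2)
            work[j] = merge(work[k], work[k+1]);
        work[j] = NULL;           /* extra null reference for odd cases */
    }
    head = (lim == 1) ? work[0] : NULL;        /* lim == 0: empty list  */
    free(work);
    return head;
}
```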

Supplemental Comparison: QuickSort

Once we have transformed the linked list into an array of nodes or pointers to nodes, that array can simply be sorted by one’s favorite array-oriented algorithm. As a supplemental test, such arrays were also generated and then sorted by an optimized QuickSort (using median-of-three partitioning [1, pp. 319-24] and insertion sorting for array segments with fewer than 20 elements). This implementation of QuickSort performs the partition operation through a function call rather than having that code in-line. The number of function calls is reduced by removing the tail-recursion that sorts the right-hand side of the current array segment.
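An optimized QuickSort of this shape (median-of-three partitioning, insertion-sort cutoff of 20, and the right-hand tail recursion replaced by a loop) can be sketched as follows. This operates on an int array for simplicity; it is an illustration of the technique, not the article’s distributed implementation.

```c
#define CUTOFF 20   /* segment size below which insertion sort is used */

static void swap(int *a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

/* Insertion sort for the small segments left by QuickSort. */
static void insertionSort(int *a, int lo, int hi)
{
    for (int i = lo + 1; i <= hi; i++) {
        int v = a[i], j = i;
        while (j > lo && a[j-1] > v) { a[j] = a[j-1]; j--; }
        a[j] = v;
    }
}

/* Median-of-three partition: order a[lo], a[mid], a[hi], stash the
 * median pivot at hi-1, and partition a[lo+1..hi-2] around it.
 * Returns the pivot's final position. */
static int partition(int *a, int lo, int hi)
{
    int mid = lo + (hi - lo) / 2;
    if (a[mid] < a[lo])  swap(a, mid, lo);
    if (a[hi]  < a[lo])  swap(a, hi, lo);
    if (a[hi]  < a[mid]) swap(a, hi, mid);
    swap(a, mid, hi - 1);
    int pivot = a[hi - 1], i = lo, j = hi - 1;
    for (;;) {
        while (a[++i] < pivot) ;      /* a[hi-1] stops this scan  */
        while (pivot < a[--j]) ;      /* a[lo] stops this scan    */
        if (i >= j) break;
        swap(a, i, j);
    }
    swap(a, i, hi - 1);               /* restore pivot to position */
    return i;
}

/* QuickSort with the right-hand tail recursion removed: recurse on
 * the left segment, loop on the right. */
void quickSort(int *a, int lo, int hi)
{
    while (hi - lo >= CUTOFF) {
        int p = partition(a, lo, hi);
        quickSort(a, lo, p - 1);      /* recurse on the left side */
        lo = p + 1;                   /* loop on the right side   */
    }
    insertionSort(a, lo, hi);
}
```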

Driving Main Program

In my use of this project, I treat it as a maintenance-programming assignment: the students are given program segments that they may not alter as well as the specifications for the procedures that they do need to generate. That is because I use it within the context of a second Data Structures course, so that the students presumably were well exposed to linked list implementations in the first course and I am simply having them exercise an alternative algorithm. It could easily be used within the context of the first Data Structures course when students are developing linked list procedures so that they can use modules that they have developed earlier in the course.

By giving the students the main program to exercise their code, I can also provide a tool to aid them in their experimentation. The time required to sort equivalent sets of data is captured in "Comma-Separated Value" format, so that the students can very easily bring their data into a spreadsheet program and use its tools (including the graphing tools) to examine their results.

The algorithm used to generate the random data sets is discussed in [7].

Complete Implementations

Both the C and the Java implementations are available through the World-Wide Web.

Connect to http://penguin.ewu.edu/~trolfe/NaturalMerge/index.html

Execution Environment

The implementations were run on Xeon processors in DELL quad-processor computers under the DELL-installed Red Hat Linux operating system. The C programs were compiled using the gcc compiler (version 3.2.2), while the Java programs were compiled and run in the Java™ 2 Runtime Environment, Standard Edition (build 1.4.2_01-b06). While the Xeon is rated as 3.0 GHz, that is the result of hyperthreading two 1.5 GHz processors, so that the Linux system sees those two processors as running at 1.5 GHz.

Experimental Results

In my implementations (both in C and in Java), I do not confirm Sedgewick’s suggestion. Rather, for random data sets I find a definite speed-up for the Natural MergeSort as compared with the top-down recursive implementation. For the C implementation I find that the Natural MergeSort runs about 15% faster, while the Java implementation runs about 9% faster. Indeed, the iterative one-element implementation even provides some speed-up compared with the recursive implementation under Java (at times over 4%), and it actually runs about 10% faster under C. As usual, the C implementation runs significantly faster than its Java counterpart.

The performance of the QuickSort algorithm in the Java environment may indicate that for the insertion sort optimization of QuickSort, a cut-off higher than 20 would be appropriate.


Figure 1: C Experimental Results


Figure 2: Java Experimental Results

Acknowledgements

The programs were run on computers acquired as part of the "Technology Initiative for the New Economy" (TINE) Congressional grant to Eastern Washington University that, among other things, provided a parallel and distributed processing resource — which these computers do admirably well! Each DELL is effectively an eight-processor SMP, so that among the five machines there are 40 processors available for distributed processing.

Dr. Ray Hamel, chair of the Computer Science Department at Eastern Washington University, provided the proof given in [6].

The results were analyzed and graphed under Microsoft® Excel 2002.

References

  1. Sedgewick, Robert. Algorithms in C; Parts 1-4. Third edition. Addison-Wesley, 1998.

  2. Sedgewick, Robert. Algorithms in Java; Parts 1-4. Third edition. Addison-Wesley, 2003.

  3. Sahni, Sartaj. Data Structures, Algorithms, and Applications in C++. McGraw-Hill, 1998.

  4. This very feature that makes the queue-based mergesort unstable is also the feature that makes it an optimal implementation of mergesort.  I am grateful to Robert Sedgewick for sending me the following reference:  Golin, Mordecai J., and Sedgewick, Robert. "Queue-mergesort", Information Processing Letters, Vol. 48, No. 5 (10 December 1993), pp. 253-259.  In addition, they end their article with a simple modification to the successive merge logic (which they credit to an unknown referee) that would make the bottom-up optimal and preserve stability:  for odd cases, merge the final three sublists.

  5. Knuth, Donald. The Art of Computer Programming; Volume 3: Sorting and Searching. Second edition. Addison-Wesley, 1998.

  6. Proof provided by Dr. Ray Hamel of my department:  There are on average (n–1)/2 sequence breaks in a sequence of n randomly selected distinct items.  Each sequence S of n items [where S = (S1, . . . , Sn)] has its reverse sequence R such that Rj = Sn-j+1.  Both sequences together have n–1 sequence breaks:  for each adjacent pair of (distinct) values, the two values are out of order either in R or in S (but not in both), and there are n–1 adjacent pairs.  For randomly selected sequences, both sets are equally likely, so that there are (n–1)/2 sequence breaks on average.

  7. Rolfe, Timothy. "Algorithm Alley: Randomized Shuffling", Dr. Dobb’s Journal, Vol. 25, No. 1 (January 2000), pp. 113-14.