CSC205::Lecture Note::Week 12

Code: Bsearch.java | BubbleSort.java | CountingSort.java | RadixSort.java | InsertionSort.java {simpler code} | MergeSort.java | AbstractSorter.java | Comparable.java | Sortable.java

Introduction to Sorting

Typically, sorting is executed on files that contain records having keys. The keys are used to control the sort.

The objective of a sort is to rearrange records so that their keys are ordered according to some well-defined ordering rule (e.g. numerical order or alphabetical order).

The sorting problem is defined as follows:

   Given a set of records r1, r2, ..., rn with key
   values k1, k2, ..., kn, respectively, rearrange
   the records into an ordering  s  such that
   records rs1, rs2, ..., rsn
   have keys obeying the property
   ks1 <= ks2 <= ... <= ksn.

If the file being sorted fits into memory, then an internal sort can be done; otherwise, the file must be sorted from disk (or some other secondary storage device), which is called an external sort.

For discussion purposes, we will focus on internal sorting (more specifically, the sorting of arrays).

A characteristic of sorting methods is stability. A sort is stable if it preserves the relative order of equal keys in a file; otherwise, it is non-stable. For example, if an alphabetized class list is sorted by grade, then a stable method produces a list in which students with the same grade are still in alphabetical order, but a non-stable method is likely to produce a list that is no longer in alphabetical order.

The efficiency of a sorting algorithm depends on two measurements: the amount of time the sort takes to run (largely determined by the number of comparisons and exchanges it performs) and the amount of extra memory it uses.

The amount of extra memory used by a sort is important. Extra memory is used by a sort in the following ways: some sorts work in place and use little or no extra memory; some sorts rearrange an array of pointers (or indices) to the records and therefore need extra memory proportional to the number of records; and some sorts need enough extra memory to hold a second copy of the data.

In general, if the records being sorted are large, then avoid "shuffling" them around; instead, rearrange the elements of an array of pointers (or indices) into the records. This is called indirect sorting.
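
For example, here is a minimal Java sketch of indirect sorting (the records and keys are made up for illustration; only the index array is rearranged):

   import java.util.Arrays;
   import java.util.Comparator;

   public class IndirectSortDemo {
      public static void main(String[] args) {
         // pretend these are large records; the String value is the key
         String[] records = { "walnut", "apple", "plum", "banana" };

         // build an array of indices 0..n-1 and sort the indices, not the records
         Integer[] index = new Integer[records.length];
         for (int i = 0; i < index.length; i++)
            index[i] = i;

         Arrays.sort(index, Comparator.comparing((Integer i) -> records[i]));

         // 'records' is untouched; 'index' now gives the sorted order
         for (int i : index)
            System.out.println(records[i]);
      }
   }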

Swapping adjacent records (i.e. items) is called an exchange.

There are several "simple" sorting algorithms based on comparing and/or exchanging records -- they are: bubbleSort, exchangeSort, selectionSort, and insertionSort.

There are numerous sorting algorithms that perform better than the aforementioned "simple" sorts. Some of these sorts are: shellSort, mergeSort, and quickSort.

The Art of Computer Programming, Volume 3: Sorting and Searching by Donald E. Knuth is a classic text on the subject of sorting. It covers about 25 different sorting algorithms. The following is a quote from the book:

"But you can't look up all those license numbers in time," Drake objected. "We don't have to, Paul. We merely arrange a list and look for duplications." -- Perry Mason (The Case of the Angry Mourner, 1951)


BubbleSort

BubbleSort is an exchange sort that is inefficient, but easy to understand.

Keep passing through the list, exchanging adjacent elements, if necessary; when no exchanges are required on some pass, the file is sorted.

  1. begin with elements 0 and 1 ('i' and 'j')
  2. compare the two adjacent elements 'i' and 'j'
  3. if element 'i' is greater than element 'j', swap them
  4. move one position to the right and repeat

When sorting in ascending order, the largest element "sinks" to the end of the list and the smallest values "bubble up" towards their proper positions near the front of the list.

BubbleSort pseudo-code:

   // assume we have an array 'a' with length 'n'
   // COMPARE(a, i, j) is true when a[i] is less than a[j]
   // SWAP(a, i, j) exchanges a[i] and a[j]

   end_i = n-1;
   while (end_i > 0) {
      last_i = 0;
      for (next_i = 0; next_i < end_i; next_i++) {
         if (COMPARE (a, next_i+1, next_i)) {    // adjacent pair out of order?
            SWAP (a, next_i, next_i+1);
            last_i = next_i;
         }
      }
      end_i = last_i;     // elements past 'last_i' are already in place
   }

BubbleSort is terrible [O(n^2)] if the list is sorted in reverse order, but it does well [O(n)] given a list that is already sorted -- one pass, no exchanges.
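
The following is a plain Java sketch of the pseudo-code above (an illustration only; it is not necessarily the BubbleSort.java linked at the top of this page):

   public class BubbleDemo {
      // sort 'a' into ascending order using bubble sort
      public static void sort(int[] a) {
         int end = a.length - 1;
         while (end > 0) {
            int last = 0;                    // index of the last swap on this pass
            for (int i = 0; i < end; i++) {
               if (a[i+1] < a[i]) {          // adjacent pair out of order?
                  int tmp = a[i]; a[i] = a[i+1]; a[i+1] = tmp;
                  last = i;
               }
            }
            end = last;                      // everything past 'last' is already sorted
         }
      }

      public static void main(String[] args) {
         int[] data = { 13, 81, 43, 92, 31, 65, 57, 26, 75, 0 };
         sort(data);
         System.out.println(java.util.Arrays.toString(data));
      }
   }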

YouTube.com::Bubble Sort by CS50 at Harvard


InsertionSort

In most cases, InsertionSort is the best of the simple sorts (i.e. BubbleSort, ExchangeSort, and SelectionSort).

InsertionSort works well if only a few elements need sorting; however, it is a poor choice if dealing with a large number of elements.

InsertionSort is often used as the final stage of more advanced sorts (e.g. QuickSort).

For an example of InsertionSort, assume we have an array A of length N. Consider pass i (1 <= i <= N-1). The sublist A[0] to A[i-1] is already sorted in ascending order, and the pass inserts A[i] into that sublist. Save A[i] as the TARGET and scan down the list, comparing the TARGET with A[i-1], A[i-2], and so forth. Each element that is greater than the TARGET is slid one position to the right (A[j] = A[j-1]). Stop the scan at the first element that is less than or equal to the TARGET, or at the beginning of the list (j = 0). The position j where the scan stops is the correct location for the TARGET; insert it there.

Said another way:

Assume elements are being collected in an array named 'a'. The initial state is that the first element, considered by itself, is sorted. Each time a new element with value 'n' is "added" to the collection, it is inserted into a slot 'i' such that a[i-1] <= a[i] <= a[i+1]. If the new element is not inserted at the end of the array, then the elements to the right of the insertion point must be shifted.

Many card players use the insertion sort. They are dealt a hand of cards, and each time they receive a card they place it in their hand in sorted order. Luckily for them, shifting cards is simple. The following is an example of a card player who stores their cards in ascending order (lowest-to-highest).

   start with an empty hand
   receive an 8:  8                 //1 element is always sorted
   receive a  4:  4 8               //4 is inserted, the 8 was shifted
   receive a 10:  4 8 10            //10 added (no shifting necessary)
   receive a  2:  2 4 8 10          //4,8,10 all shifted
   receive a  7:  2 4 7 8 10        //8,10 shifted

InsertionSort pseudo-code:

   // assume we have an array 'a' with 'n' elements
   // COMPARE(a, i, j) is true when a[i] is less than a[j]

   for (cur_i = 1; cur_i < n; cur_i++) {
      next_i = cur_i;
      while (next_i > 0) {
         if (COMPARE (a, next_i, next_i-1))
            SWAP (a, next_i, next_i-1);
         else
            break;           // a[next_i] has reached its proper position
         next_i--;
      }
   }
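
A Java sketch of the sliding (TARGET) version described above -- again, an illustration, not necessarily the InsertionSort.java linked at the top of this page:

   public class InsertionDemo {
      // sort 'a' into ascending order using insertion sort
      public static void sort(int[] a) {
         for (int cur = 1; cur < a.length; cur++) {
            int target = a[cur];             // element being inserted this pass
            int j = cur;
            while (j > 0 && a[j-1] > target) {
               a[j] = a[j-1];                // slide larger elements one slot to the right
               j--;
            }
            a[j] = target;                   // drop the target into its proper slot
         }
      }

      public static void main(String[] args) {
         int[] hand = { 8, 4, 10, 2, 7 };    // the card player example
         sort(hand);
         System.out.println(java.util.Arrays.toString(hand));   // [2, 4, 7, 8, 10]
      }
   }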

YouTube.com::Insertion Sort by CS50 at Harvard


ExchangeSort

ExchangeSort is an exchange sort that works as follows: the item at index 0 is compared with each subsequent item in the list at index 1, 2, ..., N-1. For each comparison, if the subsequent item is smaller than the item at index 0, the two entries are exchanged. Repeat this process with index 1, index 2, and so on up to index N-2.

ExchangeSort pseudo-code:

   // assume we have an array 'a' with 'n' elements

   for (cur_i = 0, limit = n-1; cur_i < limit; cur_i++) {
      for (next_i = cur_i+1; next_i < n; next_i++) {
         if (COMPARE (a, next_i, cur_i))
            SWAP (a, next_i, cur_i);
      }
   }
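
A Java sketch of the same idea (an illustration only):

   public class ExchangeDemo {
      // sort 'a' into ascending order using exchange sort
      public static void sort(int[] a) {
         for (int cur = 0; cur < a.length - 1; cur++) {
            for (int next = cur + 1; next < a.length; next++) {
               if (a[next] < a[cur]) {       // subsequent item smaller? exchange
                  int tmp = a[cur]; a[cur] = a[next]; a[next] = tmp;
               }
            }
         }
      }

      public static void main(String[] args) {
         int[] data = { 8, 1, 4, 9, 6, 3, 5, 2, 7, 0 };
         sort(data);
         System.out.println(java.util.Arrays.toString(data));
      }
   }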


SelectionSort

SelectionSort is a simple sorting algorithm that makes a number of passes through a list (or sublist) and, on each pass, selects one element to be correctly positioned. Each element is moved only once; therefore, this type of sort can be handy when sorting files with very large records and small keys.

The item at index 0 is compared with each subsequent item in the list at index 1, 2, ..., N-1. The index of the smallest item encountered is saved. After the comparisons have been completed, the smallest item is swapped with the item at index 0. Repeat this process with index 1, 2, and so on up to index N-2.

SelectionSort pseudo-code:

   // assume we have an array 'a' with 'n' elements

   for (cur_i = 0, limit = n-1; cur_i < limit; cur_i++) {
      small_i = cur_i;
      for (next_i = cur_i+1; next_i < n; next_i++) {
         if (COMPARE (a, next_i, small_i))
            small_i = next_i;
      }
      if (small_i != cur_i)
         SWAP (a, cur_i, small_i);
   }
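
A Java sketch of the pseudo-code (illustrative only; see also the course code linked above):

   public class SelectionDemo {
      // sort 'a' into ascending order using selection sort
      public static void sort(int[] a) {
         for (int cur = 0; cur < a.length - 1; cur++) {
            int small = cur;                 // index of the smallest item seen so far
            for (int next = cur + 1; next < a.length; next++) {
               if (a[next] < a[small])
                  small = next;
            }
            if (small != cur) {              // at most one swap per pass
               int tmp = a[cur]; a[cur] = a[small]; a[small] = tmp;
            }
         }
      }

      public static void main(String[] args) {
         int[] data = { 8, 1, 4, 9, 6, 3, 5, 2, 7, 0 };
         sort(data);
         System.out.println(java.util.Arrays.toString(data));
      }
   }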

YouTube.com::Selection Sort by CS50 at Harvard


ShellSort

ShellSort was discovered by a computer scientist named Donald Shell in 1959. It is also referred to as the diminishing gap sort.

It is based on InsertionSort, but adds a new feature that improves performance. The basic idea behind this sort is that in early stages, far-apart elements are compared, rather than adjacent ones as in simpler exchange sorts. This tends to eliminate large amounts of disorder quickly, so later stages have less work to do. The interval (i.e. gap) between compared elements is gradually decreased to one, at which point the sort effectively becomes an ordinary insertion sort. The performance of this algorithm can be influenced by the gaps that are used.

ShellSort is good for medium-sized arrays (a few thousand items). By default, ShellSort should probably be your choice for sorting. If performance becomes an issue, then change to a more advanced sorting technique such as QuickSort.

ShellSort pseudo-code:


   // assume we have an array 'a' with length 'n'
   // COMPARE(a, i, j) is true when a[i] is less than a[j]

   for (gap = n/2;  gap > 0;  gap /= 2) {
      for (i = gap;  i < n;  i++)
         for (j = i-gap;  j >= 0;  j -= gap) {
            if (!COMPARE (a, j+gap, j))
               break;                  // elements 'gap' apart are in order
            SWAP (a, j, j+gap);
         }
   }
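
A Java sketch using the simple n/2, n/4, ..., 1 gap sequence from the pseudo-code (other gap sequences can perform better):

   public class ShellDemo {
      // sort 'a' into ascending order using shell sort
      public static void sort(int[] a) {
         for (int gap = a.length / 2; gap > 0; gap /= 2) {
            for (int i = gap; i < a.length; i++) {
               // insertion sort on the elements that are 'gap' positions apart
               for (int j = i - gap; j >= 0 && a[j] > a[j + gap]; j -= gap) {
                  int tmp = a[j]; a[j] = a[j + gap]; a[j + gap] = tmp;
               }
            }
         }
      }

      public static void main(String[] args) {
         int[] data = { 8, 1, 4, 9, 6, 3, 5, 2, 7, 0 };
         sort(data);
         System.out.println(java.util.Arrays.toString(data));
      }
   }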

YouTube.com::Java Shell Sort by Derek Banas


QuickSort

QuickSort was first developed by C.A.R. Hoare. It requires picking a partition element (pivot) and then partially sorting the array about the pivot. Each of the two partitions is then sorted by recursive application of the same technique. The algorithm usually sorts quite rapidly, but it can also sort very slowly [O(n^2) worst case].

QuickSort is a fast divide-and-conquer algorithm for comparison sorting arrays in place (frequently O(n log n)).

The basic algorithm for QuickSort is recursive and consists of the following steps:

  1. If the number of elements is 0 or 1, return.
  2. Pick any element and call it the pivot.
  3. Partition remaining elements into two disjoint groups called L and R.
  4. Return the result of QuickSort(L), followed by the pivot, followed by QuickSort(R).

The pivot divides array elements into two groups: those smaller than the pivot and those larger than the pivot.

The partition step places every element except the pivot in one of two groups.

   13  81 43  92  31  65  57  26  75  0

   Assume 65 is selected as the pivot.  Partition the elements.

    Left:  13  43  31  57  26  0
   Right:  81  92  75

   Quicksort the left items.  Quicksort the right items.

      0  13  26  31  43  57  [65]  75  81  92

Example:
   Original list:
      8  1  4  9  6  3  5  2  7  0

   Select the pivot  6  and swap the pivot with the last element.

      8  1  4  9  0  3  5  2  7  6

   Run   i  from left-to-right, and  j  from right-to-left.
   When  i  sees a large element,  i  stops.  When  j  sees
   a small element,  j  stops.  If  i  and  j  have not
   crossed, swap their items and continue; otherwise, stop
   this loop.

      i  stops at element 8; j stops at element 2
      the 8 and 2 are swapped

   2  1  4  9  0  3  5  8  7  6

      i  stops at element 9; j stops at element 5
      the 9 and 5 are swapped

   2  1  4  5  0  3  9  8  7  6

      i  stops at element 9; j stops at element 3
      i and j have crossed -- swap pivot with
      i'th element (6 with 9)

   2  1  4  5  0  3  6  8  7  9

Picking the pivot is important to good performance. Never choose the first element as the pivot (you should also avoid the last element). The middle element is usually a reasonable choice. The perfect pivot is one that results in equally sized partitions. Picking a random pivot has proven efficient.

The  median-of-three  is a popular technique for picking a pivot. Sort the first, middle and last elements of the list and use the median (in this case middle) value. Using this technique is good because while in the process of determining the pivot, you are also getting started on the sorting of the list (element 0 will be less than or equal to the pivot and element N-1 will be greater than or equal to the pivot).

QuickSort pseudo-code:

   quickSort(T a[], int left, int right)

      if (left >= right) return;          // 0 or 1 elements: already sorted
      SWAP (a, left, (left+right)/2);     // move the middle element (pivot) to a[left]
      last = left;
      for (i = left+1; i <= right; i++) {
         if (COMPARE (a, i, left)) {      // a[i] less than the pivot?
            ++last;
            SWAP (a, last, i);
         }
      }

      SWAP (a, left, last);               // put the pivot into its final position

      quickSort(a, left, last-1);
      quickSort(a, last+1, right);
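
A Java rendering of the pseudo-code above (middle element as pivot, the simple partition shown; a sketch, not a production quicksort):

   public class QuickDemo {
      // sort a[left..right] into ascending order using quicksort
      public static void sort(int[] a, int left, int right) {
         if (left >= right) return;             // 0 or 1 elements: already sorted
         swap(a, left, (left + right) / 2);     // move the middle element (pivot) to a[left]
         int last = left;
         for (int i = left + 1; i <= right; i++) {
            if (a[i] < a[left])                 // smaller than the pivot?
               swap(a, ++last, i);
         }
         swap(a, left, last);                   // put the pivot into its final position
         sort(a, left, last - 1);
         sort(a, last + 1, right);
      }

      private static void swap(int[] a, int i, int j) {
         int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
      }

      public static void main(String[] args) {
         int[] data = { 8, 1, 4, 9, 6, 3, 5, 2, 7, 0 };
         sort(data, 0, data.length - 1);
         System.out.println(java.util.Arrays.toString(data));
      }
   }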


YouTube::UCBerkeley::CS61B::Lecture 29: Sorting I | Lecture 30: Sorting II | Lecture 32: Sorting III


Big-O Notation

The run-time efficiency of a program has two main ingredients: space and time.

Space efficiency is a measure of the amount of memory a program requires.

Time efficiency is a measure of how long a program takes to execute.

In general, there is a trade-off between space and time. If your algorithm uses more space, then typically it will execute faster and vice versa.

The relationship between N (where N is a variable representing the number of inputs on which your algorithm must operate) and the running time of an algorithm as N becomes large is called the computational complexity of that algorithm.

Computational complexity is denoted using Big-O notation. The letter O stands for the word order as it is used in the phrase on the order of. It signifies approximation.

Given two functions f and g, the function g is said to dominate function f if there is some positive constant c such that:

   c * g(x) >= f(x), for all possible values of  x

Asymptotic dominance is a variation of dominance: function g asymptotically dominates function f if there are positive constants c and x' such that:

   c * g(x) >= f(x), for all  x >= x'

In other words, function g asymptotically dominates function f for all "large" values of x. Big-O notation is defined in terms of asymptotic dominance.

          5N   versus     N^2
   -------------------------
   N=1:   5    versus     1
   N=2:   10   versus     4
   N=3:   15   versus     9
   N=4:   20   versus     16
   N=5:   25   versus     25
   N=6:   30   versus     36
   N=7:   35   versus     49

      In this example, x' is equal to 5 and the constant c is 1
      (for all N >= 5, 1 * N^2 >= 5N).
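
A few lines of Java can confirm the crossover point numerically:

   public class DominanceDemo {
      public static void main(String[] args) {
         // compare f(N) = 5N with g(N) = N^2 (c = 1) for small values of N
         for (long n = 1; n <= 10; n++)
            System.out.println("N=" + n + ":  5N=" + (5 * n) + "  N^2=" + (n * n)
                               + "  (N^2 >= 5N is " + (n * n >= 5 * n) + ")");
      }
   }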

The following are some examples of how dominance is used to determine Big-O values:

   O(N^3) + O(N^2)  =>  O(N^3)

      As  N  gets large, N cubed grows at a faster rate than
      does  N squared.  For large values of N, N cubed dominates.

   O(10 * N) + O(1000 * N) + O(1)   =>  O(N)

      Constant values are ignored.  The significance of the
      constant values is minimized for large values of N.

   O(N log N) + O(log N) + O(N^2)  =>  O(N^2)

      N squared dominates  N log N  which in turn dominates  log N.

[About logarithms.]

Let's review:

Big-O notation is used for three distinct purposes.

  1. To bound the error that is made when we ignore small terms in mathematical formulas.
  2. To bound the error that is made when we ignore parts of a program that contribute a small amount to the total being analyzed.
  3. To allow us to classify algorithms according to upper bounds on their total running times.

Categories of Common Running Times
O(1)        constant time (most efficient)
O(log N)    logarithmic time
O(N)        linear time (polynomial time)
O(N log N)  "linearithmic" time (polynomial time)
O(N^2)      quadratic time (polynomial time)
O(N^3)      cubic time (polynomial time)
O(2^N)      exponential time (impractical for large values of N)

The following table presents growth rates for some of the common functions (i.e. running times):

    N     log N         N   N log N      N**2          N**3          2**N
=========================================================================
    1         0         1         0         1             1             2
    2         1         2         2         4             8             4
    4         2         4         8        16            64            16
    8         3         8        24        64           512           256
   16         4        16        64       256          4096         65536
   32         5        32       160      1024         32768   4.29497e+09
   64         6        64       384      4096        262144   1.84467e+19
  128         7       128       896     16384       2097152   3.40282e+38

Typically, constant factors are ignored in Big-O analysis (although in reality constants frequently matter).

To compare the speed of different algorithms, computer scientists use Big-O notation. Big-O notation doesn't exactly predict performance times. Instead, it tells us that, given two algorithms with the same order, both will slow down at roughly the same rate as the number of items being sorted gets larger.

Search algorithms also have Big-O values. Linear search is O(n); binary search is O(log n). Once again, these differences matter most for large data sets. If you have 1 million items in an array, a linear search will require about 1 million tests, but a binary search will only require about 20 tests (because 2 to the 20th power is about 1 million). The 'log' in Big-O notation is usually log-base-2, but with a logarithmic algorithm the analysis need not be concerned with the base of the logarithm, since changing the base changes the total running time only by a constant factor (and constant factors are ignored).
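
The following Java sketch (not the Bsearch.java linked at the top of this page) counts the tests performed by a binary search of a 1-million-element array:

   public class BinarySearchDemo {
      static int tests;                         // number of probes performed

      // return the index of 'key' in the sorted array 'a', or -1 if not found
      static int search(int[] a, int key) {
         int lo = 0, hi = a.length - 1;
         while (lo <= hi) {
            int mid = (lo + hi) >>> 1;          // unsigned shift avoids overflow
            tests++;
            if      (a[mid] < key) lo = mid + 1;
            else if (a[mid] > key) hi = mid - 1;
            else return mid;
         }
         return -1;
      }

      public static void main(String[] args) {
         int[] a = new int[1000000];
         for (int i = 0; i < a.length; i++) a[i] = i;    // already sorted
         search(a, 314159);
         System.out.println(tests + " tests");           // about 20 (log base 2 of 1 million)
      }
   }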

   for (i = SOME_CONSTANT; i <= ANOTHER_CONSTANT; i++)      // O(1)
      // loop body requiring constant time
      
   for (i = SOME_CONSTANT; i <= N; i++)                     // O(N)
      // loop body requiring constant time
      
   for (i = SOME_CONSTANT; i <= N; i++)                     // O(N*N)
      for (j = ANOTHER_CONSTANT; j <= N; j++)
         // loop body requiring constant time

   for (i = SOME_CONSTANT; i <= N; i++)                     // O(N*M)
      // loop body requiring time O(M), where M is a variable

The following is an example of how one can determine the approximate running time of an algorithm.

    // the following logic initializes a 'n' element array
    
    index = 0;                // assignment is 1 step
    while (index < n) {       // EXPR  index < n  is 1 step
       array[index] = -1;     // assignment is 1 step
       index++;               // increment is 1 step
    }

             Si + Sc*(n+1) + Sb*n
   n = 0:    1  + 1*1      + 2*0    =  2
   n = 1:    1  + 1*2      + 2*1    =  5
   n = 2:    1  + 1*3      + 2*2    =  8
   n = 3:    1  + 1*4      + 2*3    =  11
   n = 6:    1  + 1*7      + 2*6    =  20
   n = 9:    1  + 1*10     + 2*9    =  29
   n = 300:  1  + 1*301    + 2*300  =  902

      Si => # of initialization steps
      Sc => # of steps in the condition EXPR
      Sb => # of steps in the loop body

   Conclusion:  linear -- O(n)
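
The same table can be generated mechanically; the step weights Si = 1, Sc = 1, and Sb = 2 in this Java sketch are taken from the analysis above:

   public class StepCountDemo {
      public static void main(String[] args) {
         int[] sizes = { 0, 1, 2, 3, 6, 9, 300 };
         for (int n : sizes) {
            long steps = 1 + (n + 1) + 2L * n;   // Si + Sc*(n+1) + Sb*n
            System.out.println("n = " + n + ":  " + steps + " steps");
         }
      }
   }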

Analysis of a Couple of Code Snippets

   for (k = 1; k <= n / 2; k++) {
      ...
      for (j = 1; j <= n * n; j++) {
         ...
      }
   }

Since these loops are nested, the number of times statements within the innermost loop are executed is the product of the number of repetitions of the two individual loops. Hence the number of operations is (n/2) * (n*n) = n^3/2, which is O(N^3) in Big-O terms, with a constant of proportionality equal to 1/2.

   for (k = 1; k <= n * 10; k++) {
      ...
   }

   for (j = 1; j <= n * n * n; j++) {
      ...
   }

Since one loop follows the other, the number of operations executed by both loops is the sum of the individual loop efficiencies. Hence the efficiency is 10n + n^3, which is O(N^3) in Big-O terms.

Analysis of Exchange Sort

Assume we have an array A with length N that we want to sort into ascending order; each pass finds the minimum element of the remaining (unsorted) sublist.

   Pass 1:  compare the  n-1  elements A[1]...A[n-1] 
            with A[0] and, if necessary, exchange elements
            so that A[0] always has the smallest value

   Pass 2:  compare the  n-2  elements A[2]...A[n-1]
            with A[1] and, if necessary, exchange elements
            so that A[1] is less than or equal to all elements to its right

   Pass i:  compare the  n-i  elements A[i]...A[n-1]
            with A[i-1] and, if necessary, exchange elements 

   The total number of comparisons is given by the
   arithmetic series  f(n) from 1 to n-1:
      
      f(n) = (n-1) + (n-2) + ... + 3 + 2 + 1 =  n * (n-1) / 2

   The number of comparisons grows as  n^2.

Using an Array for a List

Append an item: O(1). Insert an item at the front of the array: O(N).

Simple Sorts

Bubble, selection, insertion, and exchange are all O(N^2).

Shell sort can give much better performance than O(N^2); its exact behavior depends on the gap sequence used.

Quick sort in the worst case can be O(N^2), but on average it is O(N log N).

Merge sort is an O(N log N) sort.

Order of a Function: Mathematical Definition

The order of a function is defined as follows:

Given two non-negative functions f and g, the order of f is g if and only if g asymptotically dominates f. Using Big-O notation, this can be said as: f = O(g) (f is of order g).

Rules for Comparing Big-O Estimates

Some rules to help calculate and compare Big-O estimates:

   Let  f  and  g  be functions, and  C  a constant:
   
      O(C * f) = O(f)
      O(f * g) = O(f) * O(g)
      O(f / g) = O(f) / O(g)
      O(f) > O(g), if and only if  f  dominates  g
      O(f + g) = Max[O(f), O(g)]

About Logarithms

A logarithm of base b for the value y is the power to which you must raise b to get y. Normally, this is written as: log_b(y) = x.

   log_b(y) = x  <=>  b^x = y  <=>  b^(log_b(y)) = y

      where <=>  means "is equivalent to"

Logarithms have the following properties:

    For any positive values of  m,  n, and  r, and any positive
    bases  a  and  b  (not equal to 1):

      log(n*m) = log(n) + log(m)
      log(n/m) = log(n) - log(m)
      log(n^r) = r * log(n)
      log_a(n) = log_b(n) / log_b(a)
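
The change-of-base rule is handy in Java, where java.lang.Math supplies only natural and base-10 logarithms:

   public class Log2Demo {
      // log base 2 of n, via the change-of-base rule: log_2(n) = log_e(n) / log_e(2)
      static double log2(double n) {
         return Math.log(n) / Math.log(2.0);
      }

      public static void main(String[] args) {
         System.out.println(log2(1024));      // 10.0
         System.out.println(log2(1000000));   // about 19.93 -- the "~20 tests" mentioned earlier
      }
   }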

YouTube.com::Berkeley.edu::CS61B::Lecture 19: Asymptotic Analysis
