Home  Previous  Next 

CSC205::Lecture Note::Week 14
Assignments 
Code 
Handouts 
Resources 
Email Thurman 
Tweets::@compufoo
GDT::Bits::
Time

Weather

Populations

Special Dates
StringTree.java
{output} 
[new]
Trees.java 
BST.java
{output}
A tree is a "nonlinear" structure that consists of nodes and edges (or branches). A tree is a set of nodes and edges that connect the nodes. There is exactly one path between two nodes. A path is a connected sequence of edges.
Arrays, stacks, queues, and linkedlists define collections of objects that are sequentially accessed. These are linear lists: they have first and last elements, and each interior element has a unique successor.
In a nonlinear structure, an element may have multiple successors.
Tree structures are used to represent a "hierarchy" of information. A commonly used hierarchy is an operating system's file system.
A tree structure consists of a set of nodes that originate from a unique starting node called the root.
Some nodes may be considered a parent node which implies they have zero or more children nodes to which they are connected. Every node except the root node has a parent.
The children of a node and children of those children are called descendants, and parents and grandparents of a node are its ancestors.
Nodes are siblings when they have the same parent.
A node that doesn't have any children is called a leaf.
Each node in a tree is the root of a subtree, which is defined by the node and all descendants of the node.
grandfather  +++    father uncle aunt  +++   brother sister grandfather is the root node. grandfather is the parent node to father, uncle and aunt. father, uncle, aunt, brother and sister are all descendants of grandfather. father and grandfather are ancestors of brother and sister. brother and sister are children of father. brother are sister are leaf nodes. brother and sister are siblings. father is the root of the subtree that contains the father, brother, sister nodes.A singleinheritance class hierarchy can be represented by a tree.
Component  ++++     Container Button Canvas TextComponent   +++ +++      Panel ScrollPane Window TextArea Textfield  +++   Dialog Frame  FileDialogMovement from a parent node to its child and other descendants is done along a path.
The level of a node is the length of the path from the root to the node.
The depth of a tree is the maximum level of any node in the tree. The depth of a node is the length of its path from the node to the root.
A level 0 /  \ B C D level 1 \ \ E F level 2 / \ G H level 3 The depth of this tree is 3 (the longest path from the root to a node). The path length for node E is 2. The path length for node D is 1. The path for node H is: A>B>E>H.Definition:
A general tree is a set of nodes that is either empty, or has a designated node called the root from which descend zero or more subtrees. Each node is not an ancestor of itself, and each subtree itself satisfies the definition of a treee.Note that a tree is defined in a recursive fashion; consequently, most treeprocessing algorithms are recursive.
YouTube.com::CS 61B Lecture 23: Trees and Traversals
{TopOfPage} {Oracle.com::API Specification  Tutorial} {Derek Banas Videos} {Data Structure Visualizations}
A binary tree is a popularly used tree in which each node has at most two (2) subtrees. These subtrees are designated as the left and right, respectively. Either or both of these subtrees may be empty.
The following operations are performed on a binary tree:
create
 initialize an empty binary treeempty
 return TRUE if tree is empty; FALSE otherwiseinsert
 item (value) is added to the tree in a way that maintains the tree's hierarchical propertypreorder traversal
 each node of the tree is visited in the following order: root of the tree first, then recursively all nodes in left subtree, then recursively all the nodes in the right treeinorder traversal
 each node of the tree is visited in the following order: first visit recursively all nodes in the left subtree of the tree, then visit the root node, then recursively all nodes in the right treepostorder traversal
 each node of the tree is visited in the following order: first visit recursively all nodes in the left subtree, then recursively all nodes in the right subtree, then visit the root node[Definition] An algorithm that systematically "visits" all items in a tree is called a tree traversal.
Here is a binary tree and its respective (depthfirst) traversals:
A / \ B C / / \ D E F / \ G H Preorder traversal: A (root); B,D (left); C,E,G,H,F (right) Inorder traversal: D,B (left); A (root); G,E,H,C,F (right) Postorder traversal: D,B (left); G,H,E,F,C (right); A (root)The following tree takes on a heap property:
87 / \ / \ 84 63 / \ \ 68 79 12 / \ / \ 32 67 6 10 / \ 8 9 The data in any given node of the tree is greater than or equal to the data in its left and right subtrees.A binary tree can have a hierarchy that exhibits the property known as the ordering property.
87 / \ / \ 84 103 / \ / \ 68 86 90 109 / \ / \ 32 74 88 97 / \ 70 80 The data in each node of the tree are greater than all of the data in that node's left subtree and less than or equal to all the data in the right subtree.A binary tree with the ordering property is often called a binary search tree (BST).
A complete binary tree of depth
N
is a tree in which each level0
toN1
has a full set of nodes and all leaf nodes at levelN
occupy the leftmost positions in the tree.A Complete tree with depth 3. / \ B C / \ / \ D E F G / \ /\ H I J K Leaf nodes occupy leftmost positions.A full binary tree is a complete tree that contains 2^{N} nodes at level N.
A Full tree with depth 2. / \ B C Every node has two children. / \ / \ D E F G 2^{2} is 4 and there are 4 nodes.A degenerate tree is one that has a single leaf node and each nonleaf node has only one child. [Note: a degenerate tree is equivalent to a linkedlist.]
A \ B \ C / D \ EMany compilers make use of tree structures in obtaining forms of an arithmetic expression for evaluation.
Operators are nonleaf nodes and operands are leaf nodes. + / \  * / \ / \ A B C ÷ / \ E F (A  B) + C * (E / F) infix +  A B * C / E F prefix A B  C E F / * + postfixBinary Tree Structure
A binary tree structure is built with nodes. A tree node contains a data field and two pointer (i.e. link) fields.
Given the nonlinear structure of trees, a linkedlist type of searching (i.e. sequential) is not available; therefore, a variety of traversal methods (inorder, preorder, postorder) are commonly used.
Tree traversal methods rely heavily on recursion. Each traversal method performs three actions at a node:
 visit the node
 recursively descend the left subtree
 recursively descend the right subtree
The descent terminates when an empty tree pointer is encountered.
A node visit depends on the application (i.e. it is applicationdependent).
Inorder Traversal
An inorder traversal begins its action at a node by first descending to its left subtree. After recursively descending through the nodes in that subtree, the inorder traversal takes the second action at the node and uses the data value. The traversal completes its action at the node by performing a recursive scan of the right subtree. In the recursive descent through the subtrees, the actions of the algorithm are repeated at each new node.
 traverse the left subtree
 visit the node
 traverse the right subtree
Postorder Traversal
The postorder traversal delays a visit to a node until after a recursive descent of the left subtree and a recursive descent of the right subtree.
 traverse the left subtree
 traverse the right subtree
 visit the node
Preorder Traversal
The preorder traversal visits the node first, then it does recursive descents of the left and right subtrees, respectively.
 visit the node
 traverse the left subtree
 traverse the right subtree
Breadth First Traversal (or Level Traversal)
A breath first (or level) traversal scans the tree nodes levelbylevel starting with the root at level 0.
A common implementation of this algorithm is queuebased.
create queue Q insert root into Q while Q is not empty delete front node p from Q use node p to identify its children at next level if (p.left() != null) Q.insert(p.left()); if (p.right() != NULL) Q.insert(p.right()); Using the queue helps ensure that nodes on the same level are visited in the correct order (in this case lefttoright).BigO
Trees perform inserts, deletes, and finds in O(log N) time (on average). A degenerate tree can be O(N).
Exercises
The following inputs inserted in the given order produce a good binary tree.
NY IL GA RI MA PA DE IN VT TX OH WYWhat the binary tree look like? [answer]
What order do these state abbreviations need to be inserted in order to produce a degenerate tree? [answer]
WolframAlpha.com::Binary Tree
{TopOfPage} {Oracle.com::API Specification  Tutorial} {Derek Banas Videos} {Data Structure Visualizations}
A general binary tree can hold a large collection of data and yet provide fast access.Recall, creating a "container" or "collection" class using an array or linkedlist, requires those collections to be searched using a linear search. This can be inefficient when dealing with large amounts of data [O(N)].
Trees are efficient for searching because the path to any data value is no greater than the depth of the tree. Searching is maximized with a complete binary tree O(log_{2}N).
To store elements in a tree for efficient access, a binary search tree (BST) can be used.
For each node, the data values in the left subtree are less than the value of the node and the data values in the right subtree are greater than or equal to the value of the node.Using a BST implies that the data found in each node contain a key . In many cases, the data of a node is a record and the key is a field in the record.Operations performed on a BST are:
find  search a tree of a specific data value insert  find appropriate insertion spot and add new data value to the tree delete  search a tree and remove the first occurrence of a given data value; reattach all subtrees to maintain BST structure update  if key at current position matches key for the data item, update data value; otherwise, insert data value into the treeInsertion on a Binary Search Tree
Search the tree and find the location where the new node is to reside. This is done be scanning the left and right subtrees until the insertion point is found. For each step in the path, maintain a record of the current node and the parent of the current node. The process terminates when an empty subtree is encountered. At this location, the new node is inserted as a child of the parent.25 / \ 20 35 / \ 12 40 Insert the value 32 into this tree. Step 1: compare 32 with 25  traverse the right subtree Step 2: comapre 32 with 35  traverse the left subtree Step 3: insert 32 as left child of parent (i.e. 35). 25 / \ 20 35 / / \ 12 32 40BST Pseudocode For find(), min(), max(), delete()
BST is a collection of Node objects class BST class Node Node root int key int nodecnt Node parent Node left Node right find(key) if root is null return null return find(key, root) find(key, node) if key equals node.key return node if key less than node.key if node.left null return null return find(key, node.left) if node.right null return null return find(key, node.right) min(node) if node.left null return node return min(node.left) max(node) // left to the reader delete(key) node = find(key) if node null return decrement nodecnt // case 0: delete childless root node if node.parent null // or nodecnt equals 0 if node.left null and node.right null set root to null return // case 1: delete leaf node if node.left null and node.right null if node.parent !null if node.parent.left equals node set node.parent.left to null else set node.parent.right to null return // case 2: delete node having two kids if node.left !null and node.right !null minnode = min(node.right) set minnode.left to node.left set node.left.parent to minnode if minnode.right !null and minnode.parent not equals node set minnode.parent.left to minnode.right set minnode.right.parent to minnode.parent else if minnode equals minnode.parent.left set minnode.parent.left to null if minnode.parent not equal node set minmode.right to node.right set node.right.parent to minnode set minnode.parent to node.parent if node.parent !null if node.parent.left equals node set node.parent.left to minnode else set node.parent.right to minnode else set root = minnode set node.parent = null return // case 3: delete node having left child only if node.left !null if node.parent !null if node.parent.left equals node set node.parent.left to node.left else set node.parent.right to node.left set node.left.parent to node.parent return set root to node.left set root.parent to null return // case 4: delete node having right child only if node.parent !null) if node.parent.right equals node set node.parent.right to node.right else set node.parent.left to node.right set node.right.parent to node.parent return set root to node.right set root.parent to null returnUpdate on a Binary Search Tree
Find the node that needs updating and copy in new data. If node note not found, then insert it. What if key being modified? Find node, delete it and then insert new node containing the updated key.
{TopOfPage} {Oracle.com::API Specification  Tutorial} {Derek Banas Videos} {Data Structure Visualizations}
234 Trees are trees that have nodes which may have more than two children.
A benefit of a 234 tree is that it contains shorter search paths because there are fewer levels in the tree.
When searching a binary tree, you either go left or right at each node. In a 234 tree, there can be several different directions that a search path may follow.
Definition:
An mnode in a search tree stores m1 data values k_{1} < k_{2} < ... < k_{m1}, and has links to m subtrees T_{1}, ..., T_{m}, where for each value i, all data values in T_{i} < k_{i} <= all data values in T_{i+1}Here is an example 234 tree.
53 / \ + +   27 38 60 70 /  \ /  \ 16,25 33,36 41,46,48 55,59 65,68 73,75,79 This tree has 9 nodes. The root node contains one value. Both nodes at level 1 have two values. On level two, there are 4 nodes containing 2 values, and 2 nodes containing 3 values. Search for the value 36. 36 is less than 53, traverse the left subtree. 36 greater than 27 and less than 38, traverse the center subtree. 36 less than 33, compare to the next number. 36 equals 36  found.234 trees satisfy the following properties:
 each node stores at most 3 data values
 each internal node is a 2node, a 3node, or a 4node
 all the leaves are at the same level
In a nutshell, 234 tree is constructed as follows:
 begin with a single node and insert values into this node until it becomes full (i.e. contains three values)
 when the next value is inserted, split the node into two nodes, one storing the values less than the median and other storing values greater than the median
 the median value is stored in the parent having these two nodes as children
Construct a 234 tree using the inputs: 10 3 9 15 21 44 18 1 12 5 7 insert 10: 10 insert 3: 3,10 insert 9: 3,9,10 insert 15: split 9 3 10 15 greater than 9  search right subtree 9 3 10,15 insert 21: 9 3 10,15,21 insert 44: 44 greater than 9  search right subtree internal node is full  split median value moved into parent 9,15 3 10 21,44 insert 18: 18 greater than 15  search right subtree 9,15 3 10 18,21,44 insert 1: 1 less than 9  search left subtree 9,15 1,3 10 18,21,44 insert 12: 12 greater than 9 and less than 15  search middle subtree 9,15 1,3 10,12 18,21,44 insert 5: 5 less than 9  search left subtree 9,15 1,3,5 10,12 18,21,44 insert 7: 7 less than 9  search left subtree internal node is full; split median value 3 moved into parent 3,9,15 1,5,7 10,12 18,21,44234 trees stay balanced during insertions and deletions.
Think about topdown insertion versus the demonstrated bottomup insertion. [Don't allow any parent nodes to become 4nodes.]
234 trees have numerous unused pointers. Explore RedBlack trees for an alternative.
234 tree with N nodes has 4*N pointers each node (excluding the root) has only one pointer to it 4*N  (N1) = 3N+1 of the pointers are null (i.e. approximately 75% of the pointers are null)Example 234 tree node class.
class Node234 { Object data[3]; Node234 links[4]; ... };azrael.digipen.edu::Deleting Elements From a 234 Tree
YouTube.com::UCBerkeley::CS61B::Lecture 26: Balanced Search Trees
{TopOfPage} {Oracle.com::API Specification  Tutorial} {Derek Banas Videos} {Data Structure Visualizations}
The runtime efficiency of a program has two main ingredients: space and time.
Space efficiency is a measure of the amount of memory a program requires.
Time efficiency is a measure of how long a program takes to execute.
In general, there is a tradeoff between space and time. If your algorithm uses more space, then typically it will execute faster and vice versa.
The relationship between
N
(whereN
is a variable representing the number of inputs on which your algorithm must operate) and the running time of an algorithm asN
becomes large is called the computational complexity of that algorithm.Computational complexity is denoted using BigO notation. The letter
O
stands for the word order as it is used in the phrase on the order of. It signifies approximation.Given two functions f and g, the function g is said to dominate function f if there is some positive constant c such that:
c * g(x) >= f(x), for all possible values of xAsymptotic dominance is a variation of dominance such that:
c * g(x) >= f(x), for all x >= x'In other words, function g asymptotically dominates function f for all "large" values of x. BigO notation is represented using asymptotic dominance.
5N versus N^{2}  N=1: 5 versus 1 N=2: 10 versus 4 N=3: 15 versus 9 N=4: 20 versus 16 N=5: 25 versus 25 N=6: 30 versus 36 N=7: 35 versus 49 In this example, x' is equal to 6 and the constant is 1.The following are some examples of how dominance is used to determine BigO values:
O(N^{3}) + O(N^{2}) => O(N^{3}) As N gets large, N cubed grows at a faster rate than does N squared. For large values of N, N cubed dominates. O(10 * N) + O(1000 * N) + O(1) => O(N) Constant values are ignored. The significance of the constant values are minimized for large values of N. O(N log N) + O(log N) + O(N^{2}) => O(N^{2}) N squared dominates N log N which in turn dominates log N.Let's review:
 BigO notation is used to approximate (i.e. estimate) the running time of an algorithm; it is used to determine the most dominate term in a function; BigO represents the growth rate of a function
 running time is expressed in relative speed, not absolute speed; constants are dropped because it is not meaningful across different machines
 BigO analysis is used only when dealing with algorithms that implement loops and/or recursion
 two algorithms that are both
O(N)
may have different running times for the same number of inputs BigO measures are based on asymptotic dominance
BigOnotation is used for three distinct purposes.
 To bound the error that is made when we ignore small terms in mathematical formulas.
 To bound the error that is made when we ignore parts of a program that contribute a small amount to the total being analyzed.
 To allow us to classify algorithms according to upper bounds on their total running times.
Categories of Common Running Times O(1) constant time (most efficient) O(N) linear time (polynomial time) O(N log N) "linearithmic" time (polynomial time) O(N^{2}) quadratic time (polynomial time) O(N^{3}) cubic time (polynomial time) O(log N) logarithmic time O(2^{N}) exponential time (impractical for large values of N) The following table presents growth rates for some of the common functions (i.e running times):
N log N N N log N N**2 N**3 2**N ========================================================================= 1 0 1 0 1 1 2 2 1 2 2 4 8 4 4 2 4 8 16 64 16 8 3 8 24 64 512 256 16 4 16 64 256 4096 65536 32 5 32 160 1024 32768 4.29497e+09 64 6 64 384 4096 262144 1.84467e+19 128 7 128 896 16384 2097152 3.40282e+38Typically constant factors are ignored in BigO analysis (although in reality constants frequently matter).
To compare the speed of different algorithms, computer scientists use bigO notation. The bigO notation doesn't exactly predict performance times. Instead, what the bigO notation tells us is that given two algorithms with the same order, then both will slow down at roughly the same rate as the number of items being sorted gets larger.
Search algorithms also have bigO values. Linear search is
O(n)
, binary search isO(log n)
. Once again, these differences are most important for large data sets. If you have 1 million items in an array, a linear search will require about 1 million tests, but a binary search will only require 20 tests (because 2 to the 20 power is about 1 million  the 'log' in bigO notation is always logbase2  when you have a logarithmic algorithm, the analysis need not be concerned with the base of the logarithm, since this can change the total running time only by a constant factor (and constant factors are ignored).for (i = SOME_CONSTANT; i <= ANOTHER_CONSTANT; i++) // O(1) // loop body requiring constant time for (i = SOME_CONSTANT; i <= N; i++) // O(N) // loop body requiring constant time for (i = SOME_CONSTANT; i <= N; i++) // O(N*N) for (j = ANOTHER_CONSTANT; j <= N; j++) // loop body requiring constant time for (i = SOME_CONSTANT; i <= N; i++) // O(N*M) // loop body requiring time O(M), where M is a variableThe following is example of how one can go about determining the approximate running time of an algorithm.
// the following logic initializes a 'n' element array index = 0; // assignment is 1 step while (index < n) { // EXPR index < n is 1 step array[index] = 1; // assignment is 1 step index++; // increment is 1 step } Si + Sc*(n+1) + Sb*n n = 0: 1 + 1*1 + 2*0 = 2 n = 1; 1 * 1*2 + 2*1 = 5 n = 2; 1 * 1*3 + 2*2 = 8 n = 3: 1 + 1*4 + 2*3 = 11 n = 6: 1 + 1*7 + 2*6 = 20 n = 9: 1 + 1*10 + 2*9 = 29 n = 300: 1 + 1*301 + 300*2 = 902 Si => # of initialization steps Sc => # of steps in the condition EXPR sb => # of steps in the loop body Conclusion: linear  O(n)Analysis of a Couple of Code Snippets
for (k = 1; k <= n / 2; k++) { ... for (j = 1; j <= n * n; j++) { ... } }Since these loops are nested, the number of times statements within the innermost loop are executed is the product of the number of repetitions of the two individual loops. Hence the efficiency is
n^{3}
, orO(N^{3})
in BigO terms, with a constant of proportionality equal to 1/2.for (k = 1; k <= n * 10; k++) { ... } for (j = 1; j <= n * n * n; j++) { ... }Since one loop follows the other, the number of operations executed by both loops is the sum of the individual loop efficencies. Hence the efficiency is
10n + n^{3}
, orO(N^{3})
in BigO terms.Analysis of Exchange Sort
Assume we have array
A
with lengthN
and we want to find the minimum element in the array.Pass 1: compare the n1 elements A[1]...A[n1] with A[0] and, if necessary, exchange elements so that A[0] always has the smallest value Pass 2: compare the n2 elements A[2]...A[n1] with A[1] and, if necessary, exchange elements so that A[1] less than all elements to the right Pass i: compare the ni elements A[i]...A[n1] with A[i1] and, if necessary, exchange elements The total number of comparisons is given by the arithmetic series f(n) from 1 to n1: f(n) = (n1) + (n2) + ... + 3 + 2 + 1 = n * (n1) / 2 The number of comparisons depends on n^{2}.Using an Array for a List
Append an item: O(1). Insert an item at the front of the array: O(N).
Simple Sorts
Bubble, selection, insertion, and exchange are all O(N^{2}).
Shell sort can give much better performance than O(N^{2}) in the worst case.
Quick sort in the worst case can be O(N^{2}), but on average it is O(N log N).
Merge sort is an O(N log N) sort.
Order of a Function: Mathematical Definition
The order of a function is defined as follows:
Given two nonnegative functions f and g, the order of f is g if and only if g asymptotially dominates f. Using BigO notation, this can said as:f = O(g)
(f is of order g).Rules for Comparing BigO Estimates
Some rules to help calculate and compare BigO estimates:
Let f and g be functions, and C a constant: O(C * f) = O(f) O(f * g) = O(f) * O(g) O(f / g) = O(f) / O(g) O(f) > O(g), if and only if f dominates g O(f + g) = Max[O(f), O(g)]About Logarithms
A logarithm of base b for the value y is the power to which you must raise b to get y. Normally, this is written as:
log_{b}y = x
.log_{b}y = x <=> b^{x} = y <=> b^{logby} = y where <=> means "is equivalent to"Logarithms have the following properties:
For any positive values of m, n, and r, and any positive integers a and b: log nm = log n + log m log n/m = log n  log m log n^{r} = r log n log_{a}n = log_{b} n / log_{b}aYouTube.com::Berkeley.edu::CS61B::Lecture 19: Asymptotic Analysis
{TopOfPage} {Oracle.com::API Specification  Tutorial} {Derek Banas Videos} {Data Structure Visualizations}
Home  Previous  Next 
