Section 5.3: Decision Trees
Abstract:
Decision trees are defined, and some examples given. Binary search trees store
data conveniently for searching later. Some bounds on worst case scenarios are
established.
Definition: a decision tree is a tree in which
-
internal nodes represent actions,
-
arcs represent outcomes of an action, and
-
leaves represent final outcomes.
-
Figure 5.50/5.51, p. 380/387: Results of tossing a coin 5 times, no two heads
in a row
(binary decision tree)
-
Figure 5.51/5.52, p. 381/388: Sequential Search on 5 elements (binary tree)
-
Figure 5.52/5.53, p. 382/389: Binary Search on a sorted list (ternary tree, although it
appears binary since those leaves corresponding to equality have been
suppressed)
-
Figure 5.55/5.56, p. 387/393: Sorting a list (binary tree, provided distinct list
elements)
Exercise #2, p. 388/394 (how do we modify Figure 5.51/5.52?).
Practice 24, p. 383/390.
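The coin-toss example can be reproduced with a short sketch. It assumes each toss is an internal node, each arc is labeled H or T, and a branch is pruned as soon as it would give two heads in a row; the function name `coin_toss_leaves` is made up for this illustration.

```python
def coin_toss_leaves(tosses, prefix=""):
    """Return the leaves of the decision tree: every sequence of the given
    number of tosses that has no two heads in a row."""
    if len(prefix) == tosses:
        return [prefix]  # a leaf: one final outcome
    leaves = []
    for outcome in "HT":  # each arc is one outcome of the next toss
        if outcome == "H" and prefix.endswith("H"):
            continue  # prune the branch: two heads in a row are not allowed
        leaves.extend(coin_toss_leaves(tosses, prefix + outcome))
    return leaves


leaves = coin_toss_leaves(5)
print(len(leaves))  # 13
```

Each string returned corresponds to one root-to-leaf path in the tree.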
Two facts about binary trees in particular:
-
Any binary tree of depth d has at most $2^{d+1} - 1$ nodes. (Proof: look at the
full binary tree, as it has the most nodes per depth.)
-
Any binary tree with m nodes has depth $\geq \lfloor \log_2 m \rfloor$, where
$\lfloor x \rfloor$ is the floor function, meaning the greatest integer
less than or equal to x. Again, the proof can be motivated simply by looking
at the full binary tree situation:
Table: Adding one more node to a full binary tree bumps the depth up by 1, so that if there are
$2^d$ nodes, the depth is at least d. Hence, in the case of powers of 2, $d \geq \log_2 m$.
A more formal (and interesting) proof is by contradiction (p. 384/390):
-
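Assume $d < \lfloor \log_2 m \rfloor$: then
$d \leq \lfloor \log_2 m \rfloor - 1$.
-
From fact 1 (above the table), $m \leq 2^{d+1} - 1 \leq 2^{\lfloor \log_2 m \rfloor} - 1 \leq 2^{\log_2 m} - 1 = m - 1$, which is impossible.
By contradiction, $d \geq \lfloor \log_2 m \rfloor$.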
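As a numerical sanity check on the two facts, the following sketch finds the smallest possible depth for m nodes by filling levels as in the full binary tree, and compares it with the floor of log base 2 of m. The helper names `max_nodes` and `min_depth` are invented here, and `m.bit_length() - 1` is used as an exact integer way of computing that floor for m >= 1.

```python
def max_nodes(depth):
    """Most nodes a binary tree of the given depth can hold (the full tree)."""
    return 2 ** (depth + 1) - 1


def min_depth(m):
    """Smallest depth of any binary tree with m >= 1 nodes."""
    d = 0
    while max_nodes(d) < m:  # depth d cannot hold m nodes, so go deeper
        d += 1
    return d


for m in range(1, 17):
    floor_log2 = m.bit_length() - 1  # exact floor(log2(m)) for m >= 1
    print(m, min_depth(m), floor_log2)
```

For every m printed, the two right-hand columns agree, so the bound is attained by the shallowest tree.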
These facts lead to the following
Theorem (on the lower bound for searching):
Any algorithm that solves the search problem for an n-element list by
comparing the target element x to the list items must do at least
$\lfloor \log_2 n \rfloor + 1$ comparisons in the worst case.
An algorithm that, in its worst case, does no more comparisons than this lower
bound on worst-case behavior is optimal in its worst-case behavior. Binary
search is optimal (based on Practice 24!).
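The optimality claim can be checked empirically with a sketch like the one below. It assumes that each three-way comparison of the target against a list item counts as one comparison; `binary_search` and `worst_case` are illustrative names, and the test lists of even numbers are made up so that odd targets are guaranteed to be missing.

```python
def binary_search(items, target):
    """Return (found, comparisons), counting each three-way comparison of
    the target with a list item once."""
    low, high = 0, len(items) - 1
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1
        if target == items[mid]:
            return True, comparisons
        if target < items[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return False, comparisons


def worst_case(n):
    """Largest comparison count over every present and every absent target."""
    items = list(range(0, 2 * n, 2))  # n sorted even numbers
    targets = range(-1, 2 * n)        # every item and every gap between items
    return max(binary_search(items, t)[1] for t in targets)


for n in (1, 2, 5, 8, 100):
    print(n, worst_case(n), n.bit_length())  # n.bit_length() == floor(log2 n) + 1
```

The last two columns agree for every n tried, which matches the theorem's lower bound.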
The binary search algorithm requires a sorted list. If your data is unsorted
(it may be changing dynamically in time, if you are updating a database of
customers, for example), you can populate a tree which
approximates a sorted list, and then use a modified search algorithm
(binary tree search) to search it. A binary search tree is
constructed as follows:
-
The first item in the list is the root;
-
Successive items are inserted by comparing them to existing nodes, from the
root node: if less than a node, descend to the left child and iterate; if
greater than, descend to the right child.
-
If, in descending, there is no child, you create a new node.
For example, Figure 5.54/5.55:
Practice #25, p. 386/392.
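The construction rules can be sketched in code as follows. The `Node` class, the function names, and the sample list are assumptions made for illustration (the list is not the one in Figure 5.54/5.55), and silently ignoring a repeated item is a choice the notes do not specify.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value by descending from the root; return the root."""
    if root is None:
        return Node(value)  # the first item becomes the root
    node = root
    while True:
        if value < node.value:
            if node.left is None:
                node.left = Node(value)  # no child here: create a new node
                return root
            node = node.left
        elif value > node.value:
            if node.right is None:
                node.right = Node(value)
                return root
            node = node.right
        else:
            return root  # assumption: a repeated item is ignored


def build(items):
    root = None
    for item in items:
        root = insert(root, item)
    return root


def in_order(node):
    """Left subtree, node, right subtree: yields the items in sorted order."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


root = build([5, 8, 2, 12, 10, 14, 9])
print(root.value)      # 5
print(in_order(root))  # [2, 5, 8, 9, 10, 12, 14]
```

The in-order traversal shows the sense in which the tree approximates a sorted list, even though the items arrived unsorted.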
The binary tree search algorithm works in the same way as you'd introduce a new
node, only the algorithm terminates if
-
the element is equal to a node, or
-
the element is unequal to a node that has no child in the direction of descent (a leaf of the binary search tree, for example).
In this case the binary search tree serves as the decision tree for the binary
tree search algorithm.
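A minimal sketch of binary tree search, assuming for brevity that the tree is written as nested `(value, left, right)` tuples with `None` for a missing child; the sample tree and the name `tree_search` are hypothetical.

```python
def tree_search(tree, target):
    """Search nested (value, left, right) tuples; return (found, comparisons)."""
    comparisons = 0
    while tree is not None:
        value, left, right = tree
        comparisons += 1
        if target == value:
            return True, comparisons  # the element is equal to a node
        tree = left if target < value else right  # descend as in insertion
    return False, comparisons  # ran out of children: the element is absent


tree = (5,
        (2, None, None),
        (8, None, (12, (10, None, None), (14, None, None))))

print(tree_search(tree, 10))  # (True, 4)
print(tree_search(tree, 7))   # (False, 2)
```

The number of comparisons is the number of nodes on the path followed, so the worst case is governed by the depth of the tree rather than by the number of items.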
Exercise #9, p. 389/395.
Examine Figure 5.55/5.56, p. 387/393:
In this case, we're sorting a three-element list using a decision tree. The
author calls this a stupid algorithm (actually, "not particularly astute"):
why?
(Practice #26, p. 387/393. How would we modify Figure 5.55/5.56?)
Assuming no equal elements in the list, the decision tree for sorting is indeed
a binary tree (rather than a ternary tree, with = included). In this case, we
can also get a lower bound on sorting a list with n elements:
-
There are n! possible sorted lists, and there are at least that many leaves
p ($p \geq n!$). (In Figure 5.55/5.56, there are eight leaves, but only 6 = 3!
different sorted lists.)
-
A worst-case final outcome in the decision tree is given by the depth d of
the tree.
-
Since the tree is binary, $p \leq 2^d$ (the maximum number of leaves possible at
depth d).
-
Taking logs (base 2), we get $\log_2 p \leq d$, or $d \geq \lceil \log_2 p \rceil$,
where $\lceil x \rceil$ is the ceiling function, which yields the
smallest integer greater than or equal to x.
-
Hence, $d \geq \lceil \log_2 n! \rceil$.
This is the Theorem on the lower bound for sorting: that you have to go to at
least a depth of $\lceil \log_2 n! \rceil$ in the worst case.
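To make the bound concrete, the sketch below tabulates the ceiling of log base 2 of n! for small n and checks it against one possible decision tree for three distinct elements. The function `sort3` is an illustration written for these notes, not the tree of Figure 5.55/5.56; it has six leaves and returns the number of comparisons made along the path taken.

```python
import math
from itertools import permutations


def sorting_lower_bound(n):
    """Return ceil(log2(n!)), computed exactly with integer arithmetic."""
    return (math.factorial(n) - 1).bit_length()


def sort3(a, b, c):
    """Sort three distinct values; return (sorted_tuple, comparisons)."""
    if a < b:
        if b < c:
            return (a, b, c), 2
        if a < c:
            return (a, c, b), 3
        return (c, a, b), 3
    if a < c:
        return (b, a, c), 2
    if b < c:
        return (b, c, a), 3
    return (c, b, a), 3


for n in range(1, 7):
    print(n, math.factorial(n), sorting_lower_bound(n))

worst = max(sort3(*p)[1] for p in permutations((1, 2, 3)))
print(worst == sorting_lower_bound(3))  # True
```

For n = 3 the bound is 3 comparisons, and this particular tree meets it in its worst case.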
LONG ANDREW E
Fri Oct 25 10:31:07 EDT 2002