> import Data.List hiding (insert,delete)
Let's review binary trees with a simple, natural binary tree. We're
just using integer keys, so this tree can only be used for testing membership.
We allow empty trees, otherwise nodes.
> data Tree = N Int Tree Tree | E deriving (Show,Eq)
Let's define a "shorthand" for the empty tree.
> empty = E
Here is an accessor for the key.
> key (N m _ _) = m
The tree is supposed to preserve the property that the mark at any
node is bigger than the marks in the left subtree and smaller than the
marks in the right subtree. We might check this during node creation
and define a custom constructor.
> node m E E = N m E E
> node m l E =
> if m < key l then undefined else
> N m l E
> node m E r =
> if m > key r then undefined else
> N m E r
> node m l r =
> if m < key l then undefined else
> if m > key r then undefined else
> N m l r
Let's compute basic properties of the tree. The size is the total
number of nodes.
> size E = 0
> size (N _ l r) = 1 + size l + size r
The height is the maximum pathlength from root to a leaf.
> height E = 0
> height (N _ l r) = 1 + max (height l) (height r)
To check wether a node is a member, we "walk down the tree". There
are four cases: the node is empty, the node contains the target, or
otherwise the target is less or greater than the mark in the node.
> member E x = False
> member (N k l r) x =
> if x==k then True
> else if x else member r x
Note that this goes down either the left branch or the right branch.
That means that it takes at most as many steps as the height of the
tree.
Insertion in a natural binary tree is quite analogous to membership.
In balanced binary trees, it becomes considerably more complicated.
> insert E x = N x E E
> insert n@(N k l r) x =
> if x==k then n
> else if x else node k l (insert r x)
The first few cases for deletion are not that hard. If the tree
is empty, we have an error.
> delete E x = undefined
If there is only a left or right subbranch, we simply replace
the current node with the subbranch.
> delete (N k l E) x | x==k = l
> delete (N k E r) x | x==k = r
If the key is not in the current node, we just delete recursively
either in the left branch or the right branch, analogous to insertion.
> delete (N k l r) x | x/=k =
> if x else node k l (delete r x)
The hard case is when the current node contains both left and right
subbranches. When we delete the current mark, we need to put something
there that preserves the overall structure of the tree.
A simple idea is to take the minimum of the right branch, remove that
from the right branch, and put it in the current node.
> delete (N k l r) x | x==k =
> let (m,r') = delmin r in
> N m l r'
Fortunately, removing the minimum from a tree is a fairly easy operation.
> delmin (N k E r) = (k,r)
> delmin (N k l r) =
> let (m,l') = delmin l in
> (m,N k l' r)
Now, with all these definitions, we can also sort again:
> enqueue x t = insert t x
> dequeue E = Nothing
> dequeue t = Just (delmin t)
> tree_sort l =
> let pq = foldr enqueue empty l in
> unfoldr dequeue pq
> l = [11,9,2,7,1,8,4,3,5,6,10]
> t = foldr enqueue empty l
> s = tree_sort l
With all these operations, it's not clear that our tree is correct. It is therefore
useful to write some functions to verify that the tree is valid, which mainly means
verifying the order relationships between the marks and the subtrees.
> tmin (N m E E) = m
> tmin (N m l E) = min m (tmin l)
> tmin (N m E r) = min m (tmin r)
> tmin (N m l r) = min m (min (tmin l) (tmin r))
> tmax (N m E E) = m
> tmax (N m l E) = min m (tmax l)
> tmax (N m E r) = min m (tmax r)
> tmax (N m l r) = min m (min (tmax l) (tmax r))
> check True = 0
> check False = undefined
> validate E = 0
> validate (N m E E) = 1
> validate (N m l E) = 1 + check ((tmax l) < m) + validate l
> validate (N m E r) = 1 + check ((tmin r) > m) + validate r
> validate (N m l r) = 1 + check ((tmax l) < m) + check ((tmin r) > m) + validate l + validate r
Here is a simple test; something like this is called a "unit test".
> t' = validate (foldl delete t [11,3,8,4])
Incidentally, it is also important to verify that our validation
function itself actually detects errors; we need to "unit test" the
test code.
> -- validate (N 2 (N 3 E E) E)