Tip:
Highlight text to annotate it
X
In our previous lesson, we introduced you to tree data structure. We discussed tree
as a logical model and talked briefly about some of the applications of tree. Now, in
this lesson we will talk a little bit more about binary trees. As we had seen in our
previous lesson, binary tree is a tree with this property that each node in the tree can
have at most 2 children. We will fist talk about some general properties of binary tree
and then we can discuss some special kind of binary trees like binary search tree which
is a really efficient structure for storing ordered data. In a binary tree as we were
saying, each node can have at most 2 children. In this tree that I have drawn here, nodes
have either 0 or 2 children. We could have a node with just one child. I have added one
more node here and now we have a node with just one child. Because each node in a binary
tree can have at most 2 children, we call one of the children left child and another
right child. For the root node, this particular node is left child and this one is right child.
A node may have both left and right child. These 4 nodes have both left and right child
or a node can have either of left and right child. This one has got a left child, but
has not got right child. I'll add one more node here. Now this node has a right child,
but does not have a left child. In a program, we would set the reference or pointer to left
child as NULL. So, we can say that for this node left child is NULL and similarly for
this node, we can say that the right child is NULL. For all the other nodes that do not
have children, that are leaf nodes, a node with 0 child is called leaf node. For all
these nodes, we can say that both left and right child are NULL. Based on properties,
we classify binary trees into different types. I'll draw some more binary trees here. If
a tree has just one node, then also its a binary tree. This structure is also a binary
tree. This is also a binary tree. Remember, the only condition is that a node cannot have
more than 2 children. A binary tree is called strict binary tree or proper binary tree if
each node can have either 2 or 0 children. This tree that I am showing here is not a
strict binary tree because we have 2 nodes that have one child. I'll get rid of 2 nodes
and now this is a strict binary tree. We call a binary tree complete binary tree if all
levels except possibly the last level are completely filled and all nodes are as far
left as possible. All levels except possibly the last level will anyway be filled. So,
the nodes at the last level, if its not filled completely must be as far left as possible.
Right now this tree is not a complete binary tree. Nodes at same depth can be called nodes
at same level. Root node in a tree has depth 0. Depth of a node is defined as length of
path from root to that nodes. In this figure lets say nodes at depth 0 are nodes at level
0. I can simply say L-0 for level 0. Now, these two nodes are at level 1, these 4 nodes
are at level 2 and finally these 2 nodes are at level 3. The maximum depth of any node
in the tree is 3. Maximum depth of a tree is also equal to height of the tree. if we
will go numbering all the levels in the tree like L-0, L-1, L-2 and so on, then the maximum
number of nodes that we can have at some level i, will be equal to 2 to the power i. At level
0, we can have 1 node, 2 to the power 0 is 1. Then at level 1, we can have at max 2 nodes.
At level 2 , we can have 2 to the power 2 nodes at max which is 4. So, in general at
any level i, we can have at max 2 to the power i nodes. You should be able to see this very
clearly. Because each node can have 2 children, so if we have x nodes at a level, then each
of these x nodes can have 2 children. So, at next level, we can have at most 2x children.
Here in this binary tree, we have 4 nodes at level 2 which is the maximum for level
2. Now, each of these nodes can possibly have 2 children. I am just drawing the arrows here.
So, at level 3, we can have max 2 times 4 i.e 8 nodes. Now, for a complete binary tree,
all the levels have to be completely filled. We can give exception to the last level or
deepest level. It doesn't have to be full. But the nodes have to be as left as possible.
This particular tree that I am showing here is not a complete binary tree because we have
2 vacant node positions in left here. I'll do slight change in this structure. Now this
is a complete binary tree. We can have more nodes at level 3, but there should not be
a vacant position left. I have added one more node here and this still is a complete binary
tree. if all the levels are completely filled, such a binary tree can also be called perfect
binary tree. In a perfect binary tree, all levels will be completely filled. If h is
the height of a perfect binary tree, remember height of a binary tree is length of longest
path between root to any of the leaf nodes or i should say number of edges in longest
path from root to any of the leaf nodes. Height of a binary tree will also be equal to max
depth, Here, for this binary tree, max depth is 3. Maximum number of nodes in a tree with
height h will be equal to, we will have 2 to the power 0 nodes at level 0, 2 to the
power 1 nodes at level 1 and we'll go on summing for height h, we will go till 2 to the power
h. At deepest level, we will have 2 to the power h nodes. Now this will be equal to 2
to the power h plus 1 minus 1. h+1 is number of levels here. We can say that 2 to the power
number of levels minus 1. In this tree, number of levels is 4. We have L0 till L3. So, number
of nodes, maximum number of nodes will be 2 to the power 4 minus 1 which is 15. So,
a perfect binary tree will have maximum number of nodes possible for a height because all
levels will be completely filled. Well, I should say maximum number of nodes in a binary
tree with height h. Ok, I can ask you this also. What will be height of a perfect binary
tree with N nodes. Lets say N is number of nodes in a perfect binary tree. To find out
height, we'll have to solve this equation n = 2^h+1 - 1 because if height is h, number
of nodes will be 2 to the power (h+1) minus 1. We can solve this equation and the result
will be this. Remember n is number of nodes here. I'll leave the maths for you to understand.
Height will be equal to log (n+1) to the base 2 minus 1. In this perfect binary tree that
I am showing here, number of nodes is 15. So, n is 15. (n+1) will be 16. So, h will
be log 16 to the base 2 minus 1. log 16 to the base 2 will be 4. So, the final value
will be 4-1 equal 3. In general, for a complete binary tree, we can also calculate height
as floor of log n to the base 2. So, we need to take integral part of log n to the base
2. Perfect binary tree is also a complete binary tree. Here n is 15. log of 15 to base
2 is 3.906891. if we'll take the integral part, then this will be 3. I'll not go into
proof of how height of complete binary tree will be log n to the base 2. We'll try to
see that later. All this maths will be really helpful when we will analyze cost of various
operations on binary tree. Cost of a lot of operations on tree in terms of time depends
upon the height of tree. For example, in binary search tree which is a special kind of binary
tree, the cost of searching, inserting or removing an element in terms of time is proportional
to the height of tree. So, in such case we would want the height of the tree to be less.
Height of a tree will be less if the tree will be dense, If the tree will be close to
a perfect binary tree or a complete binary tree. Minimum height of a tree with n nodes
can be log n to the base 2 when the tree will be a complete binary tree. if we will have
an arrangement like this, then the tree will have maximum height. With n nodes, minimum
height possible is floor of or integral part of log n to the base 2 and maximum height
possible with n nodes in n-1 when we will have a sparse tree like this which is as good
as a linked list. Now, think about this. If I'm saying that time taken for an operation
is proportional to height of the tree or in other words I can say that if time complexity
of an operation is big-oh of h where h is height of the binary tree, then for a complete
or perfect binary tree, my time complexity will be big-oh of log n to the base 2 and
in worst case for this sparse tree, my time complexity will be big-oh of n. Order of log
n is almost best running time possible. For n as high as 2 to the power 100, log n to
the base 2 is just 100. With order of n running time, if n will be 2 to the power 100, we
won't be able to finish our computation in years even with most powerful machines ever
made. So, here is the thing. Quite often, we want to keep the height of a binary tree
minimum possible or most commonly, we say that we try to keep a binary tree balanced.
We call a binary tree balanced binary tree, if for each node, the difference between height
of left and right sub-tree is not more than some number k. Mostly k would be 1. So, we
can say that for each node, difference between height of left and right sub-tree should not
be more than 1. There is something that I want to talk about height of a tree. We had
defined height earlier as number of edges in longest path from root to a leaf. Height
of a tree with just one node where the node itself will be a leaf node will be 0. We can
define an empty tree as a tree with no node and we can say that height of an empty tree
is -1. So, height of tree with just one node is 0 and height of an empty tree is -1. Quite
often, people calculate height as number of nodes in longest path from root to a leaf.
In this figure, I have drawn one of the longest paths from root to a leaf. We have 3 edges
in this path. So, the height is 3. If we will count number of nodes in the path, height
will be 4. This looks very intuitive and I have seen this definition of height at lot
of places. if we will count the nodes, height of tree with just one node will be equal to
1 and then we can say that height of an empty tree will be 0, but this is not the correct
definition and we are not going to use this assumption. We are going to say that height
of an empty tree is -1 and height of tree with one node is 0. The difference between
heights of left and right sub-trees of a node can be calculated as absolute value of height
of left subtree minus height of right subtree and in this calculation, height of a sub-tree
can be -1 also. For this leaf node here in this figure, both left and right sub-trees
are empty, so both hleft or height of left sub-tree and hright or height of right sub-tree
will be -1, but the difference overall will be 0. For all nodes in a perfect tree, difference
will be 0. I have got rid of some nodes in this tree and now by the side of each node,
I have written the value of diff. This is still a balanced binary tree because the maximum
diff for any node is 1. Lets get rid of some more nodes in this tree and now this is not
balanced because one of the nodes has diff 2. For this particular node, height of left
sub-tree is 1 and height of right sub-tree is -1 because right sub-tree is empty. So,
the absolute value of difference is 2. We try to keep a tree balanced to make sure its
dense and its height is minimized. If height is minimized, cost of various operations that
depend upon height are minimized. Ok, the next thing that I want to talk about very
briefly is how we can store binary trees in memory. One of the ways that we had seen in
our previous lesson which is most commonly used is dynamically created nodes linked to
each other using pointers or references. For a binary tree of integers, in C or C++, we
can define node like this - data type here is integer, so we have a field to store data
and we have two pointer variables, one to store address of left child and another to
store address of right child. This of course is the most common way. Nodes dynamically
created at random locations in memory linked together through pointers, but in some special
cases, we use arrays also. Arrays are typically used for complete binary trees. I have drawn
a perfect binary tree here. Lets say this is a tree of integers. What we can do is,
we can number these nodes from 0 starting at root and going level by level from left
to right. So, we will go like 0, 1 , 2 , 3 , 4, 5 and 6. Now, I can create an array of
7 integers and these numbers can used as indices for these nodes. So, at 0th position, I'll
fill 2, at 1th position I'll fill 4, at 2th position, we will have 1 and I'll go on like
this. We have filled in all the data in the array, but how will we store that information
about the links. How will we know that the left child of root has value 4 and the right
child of root has value 1? Well, in case of complete binary tree, if we will number the
nodes like this that for a node at index i, the index of left child will be 2*i+1 and
the index of right child will be 2i+2 and remember this is true only for a complete
binary tree. For 0, left child is - 2i+1 for i = 0 will be 1 and 2i+2 will be 2. Now, for
1, left child is at index 3, right child is at index 4. For i =2, 2i+1 will be 5 and 2i+2
will be 6. We will discuss array implementation in detail when we will talk about a special
kind of binary tree called heap. Arrays are used to implement heaps. I'll stop here now.
In our next lesson, we will talk about binary search tree which is also a special kind of
binary tree that gives us a really efficient storing structure in which we can search something
quickly as well as update it quickly. This is it for this lesson. Thanks for watching
!!