Verbal description of implementing BSTs and AVL trees

| 2012. 11. 26. 19:40

This is, unsurprisingly, a report I wrote for a class assignment, copied and pasted as-is; since I couldn't find a concrete explanation anywhere on the internet no matter how hard I searched, I'm posting it in case even a simple write-up like this turns out to be useful to someone.
Publishing the algorithms I coded myself probably wouldn't matter much, but there's no telling what ripple effect it might have on the DS course in my alma mater's EE department, so for now I'll hold off.
Honestly, I have no desire to publicly distribute this long-winded, extremely awkward code on the internet anyway.
Still, just in case — and this is a selfish thought that gives no consideration whatsoever to how many views this post will actually get — if anyone asks me personally, I'll hand over that trivial code without hesitation.

The aesthetics of condensed language. A greatness that will never be found in my code.

The tree structure itself can be written to taste, or to suit each use case; the internally used functions are not that many in practice, and anyone who reads the explanation below carefully — who would, though? — should get a feel for which functions to implement and how.

Of course, to stress it once more: the code can be shared at any time.

Finally, I don't recall there being any requirement on whether the report had to be written in English or Korean, but I write all my reports in English anyway, and I don't usually reread what I've written, so the text below will probably never be translated into Korean of my own volition.

(1) Inorder traversal

The first node of an inorder traversal is always the leftmost node reachable from the root. For all traversals, I defined a variable int lastmove, which records the last move of the traversal (0 means it was going downward, 1 means it was coming up from the right child, and 2 means it was coming up from the left child). I'll explain the conditions verbally rather than just listing the values of lastmove.

When looking for the next node of an inorder traversal, we divide the work into two sub-cases: when the current node is internal and when it is external. Note the distinction between visit and move: visiting a node means the function returns that node, while moving to a node means we change the object node and repeat the whole sequence again.

When it is external: if it is a left child, visit its parent; if it is a right child, move to its parent; if it has no parent, which means the node itself is the root, return NULL.

When it is internal, the cases of going down and coming up from the left child can be handled the same way. If the node has a right child, visit the leftmost node of that right subtree; if it has no right child and is the root, return NULL; otherwise (no right child, not the root), if it is a left child, visit its parent, and if it is a right child, move to its parent. If the sequence was coming up from the right child, we no longer need to consider the right subtree, so if the node is the root, return NULL; if it is a left child, visit its parent; if it is a right child, move to its parent.
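The rules above can be condensed into a small successor routine over parent pointers. This is only a sketch: the struct tree_elem layout and the names inorder_first, inorder_next, new_node and attach are my own assumptions, not the original assignment code, and instead of the lastmove flag it detects which child it came up from by pointer comparison, which carries the same information.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);  /* links start out NULL */
    n->key = key;
    return n;
}

static void attach(struct tree_elem *p, struct tree_elem *l, struct tree_elem *r) {
    p->left = l;  p->right = r;
    if (l) l->parent = p;
    if (r) r->parent = p;
}

/* leftmost node under n: the first node of an inorder traversal */
struct tree_elem *inorder_first(struct tree_elem *n) {
    while (n && n->left)
        n = n->left;
    return n;
}

/* next inorder node, or NULL when n was the last one */
struct tree_elem *inorder_next(struct tree_elem *n) {
    if (n->right)                               /* visit the leftmost node of the right subtree */
        return inorder_first(n->right);
    while (n->parent && n == n->parent->right)  /* keep moving up from the right */
        n = n->parent;
    return n->parent;                           /* visit the parent (NULL past the root) */
}
```

Calling inorder_next repeatedly from inorder_first(root) yields the keys in sorted order when the tree is a BST.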

(2) Preorder traversal

The first node of preorder traversal should always be the root of the tree. We also divide the case into two sub-cases when we look for the next node of preorder traversal.

When the node is external, it has already been visited, so we just move to its parent; we return NULL when that external node is the root.

When it is internal: if the sequence was going down and the node has a left child, we visit the left child; if it only has a right child, we visit the right child. If the sequence was coming up from the left child: if the node has a right child, visit that right child; if it has no right child and is the root, return NULL; if it has no right child and isn't the root, we simply move to its parent. If the sequence was coming up from the right child, the algorithm is almost the same as the coming-up-from-the-left-child case, except that we no longer need to consider the right child.
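The same conditions collapse into a compact preorder successor. Again a sketch under assumed names (struct tree_elem, preorder_next, new_node, attach are mine); the climb loop replaces the explicit lastmove bookkeeping.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

static void attach(struct tree_elem *p, struct tree_elem *l, struct tree_elem *r) {
    p->left = l;  p->right = r;
    if (l) l->parent = p;
    if (r) r->parent = p;
}

/* next preorder node after n, or NULL when the traversal is finished */
struct tree_elem *preorder_next(struct tree_elem *n) {
    if (n->left)                    /* going down: the left child comes first */
        return n->left;
    if (n->right)
        return n->right;
    /* external: climb until we can branch into an unvisited right subtree */
    while (n->parent) {
        if (n == n->parent->left && n->parent->right)
            return n->parent->right;
        n = n->parent;
    }
    return NULL;                    /* climbed back past the root */
}
```

Starting from the root and calling preorder_next until NULL visits every node in preorder.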

(3) Postorder traversal

The first node of a postorder traversal is always the leftmost node under the root; but if that leftmost node has a right child, we repeat the descent until the node is external. When looking for the next node of a postorder traversal, we again divide into two sub-cases.

When the node is external: if it is a left child, we check whether it has a sibling (the right child of its parent), and if it does, we visit the node found by applying to that sibling the same descent we use to find the first postorder node of the whole tree; if it doesn't, we visit its parent. If it is a right child, we don't need to check for a sibling, so we just visit the parent. If it is the root, return NULL.

When the node is internal: if the sequence was going down, visit the node found by the same descent we use to find the first postorder node. If the sequence was coming up from either the left or the right, the procedure is the same: if the node is the root, return NULL; if it is a left child, check whether it has a sibling, and if so, visit the first postorder node of the sibling's subtree, otherwise visit the parent; if it is a right child, we don't need to consider the sibling case, so we simply visit the parent.
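The postorder rules reduce to two short routines. A minimal sketch with the same assumed struct tree_elem and helper names as before; note that postorder_first descends into the right subtree whenever a node has no left child, exactly the repeated procedure described above.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

static void attach(struct tree_elem *p, struct tree_elem *l, struct tree_elem *r) {
    p->left = l;  p->right = r;
    if (l) l->parent = p;
    if (r) r->parent = p;
}

/* first postorder node under n: descend left when possible, else right */
struct tree_elem *postorder_first(struct tree_elem *n) {
    while (n) {
        if (n->left)       n = n->left;
        else if (n->right) n = n->right;
        else               return n;     /* external: visit it first */
    }
    return NULL;
}

/* next postorder node after n, or NULL at the root */
struct tree_elem *postorder_next(struct tree_elem *n) {
    if (!n->parent)
        return NULL;                     /* the root is always visited last */
    if (n == n->parent->left && n->parent->right)
        return postorder_first(n->parent->right);  /* sibling's subtree next */
    return n->parent;
}
```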

(4) Height, level and reversing

To find the height of a node, a recursive method works best. First we compute the height of the left child (size_t heightLeft) and the height of the right child (size_t heightRight) by calling the function recursively. If the height of the left child is greater than that of the right, we return heightLeft + 1, otherwise heightRight + 1, because the height of a node is one more than the larger of its children's heights.

To find the level of a node, we start from the object node and walk up to the root, counting the number of jumps.

To reverse the tree, a recursive method is again effective, because we have to exchange the left and right children of every node. The algorithm is simple: first exchange the left and right children of the object node, then call the same function on each child.
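The three helpers can be sketched as follows, assuming the same hypothetical struct tree_elem with parent pointers. The convention here is that a single node has height 1 and the empty tree height 0; the original report doesn't pin this down, so treat it as my assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

static void attach(struct tree_elem *p, struct tree_elem *l, struct tree_elem *r) {
    p->left = l;  p->right = r;
    if (l) l->parent = p;
    if (r) r->parent = p;
}

/* height by recursion: one more than the taller child (empty subtree = 0) */
size_t height(struct tree_elem *n) {
    if (!n) return 0;
    size_t hl = height(n->left), hr = height(n->right);
    return (hl > hr ? hl : hr) + 1;
}

/* level by walking up: count the jumps to the root */
size_t level(struct tree_elem *n) {
    size_t lv = 0;
    while (n->parent) { n = n->parent; lv++; }
    return lv;
}

/* reverse (mirror) by recursion: swap the children, then recurse into them */
void reverse(struct tree_elem *n) {
    if (!n) return;
    struct tree_elem *tmp = n->left;
    n->left = n->right;
    n->right = tmp;
    reverse(n->left);
    reverse(n->right);
}
```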

(5) BST insertion

If the tree is empty, push the object node in as the root. Otherwise, we decide the destination of the node by comparing key values. If the key of the object node is equal to or greater than that of the existing node, we move to the right child of the existing node and compare again; if it is smaller, we move to the left child and compare again. When there is no left (or right) child to move to, push the object node into that empty spot.
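A sketch of that insertion, with equal keys going right as specified; the struct tree_elem layout and the names bst_insert and new_node are my assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

/* BST insertion: keys equal to or greater than the current node go right */
void bst_insert(struct tree_elem **root, struct tree_elem *n) {
    n->left = n->right = n->parent = NULL;
    if (!*root) {                    /* empty tree: the node becomes the root */
        *root = n;
        return;
    }
    struct tree_elem *cur = *root;
    for (;;) {
        if (n->key >= cur->key) {
            if (!cur->right) { cur->right = n; n->parent = cur; return; }
            cur = cur->right;
        } else {
            if (!cur->left)  { cur->left = n;  n->parent = cur; return; }
            cur = cur->left;
        }
    }
}
```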

(6) Swapping tree nodes

Implementing a swap function is really helpful when we deal with BST removal. Basically, we take pointers to the parent, the left child and the right child of the two object nodes and exchange each value with its counterpart. When the two object nodes are adjacent to each other, the exchanged pointers can end up pointing at a node itself rather than at the desired nodes, so we finally fix those cases by assigning the correct nodes by force, which looks somewhat redundant but is actually necessary.
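That pointer surgery, with its adjacency fix-ups, is easy to get wrong. As a hedged aside, here is the shortcut many implementations take instead: swap the stored payloads and leave every link alone. This is explicitly not the author's pointer-based swap; it only works when nothing outside the tree holds on to particular nodes, but it makes removal much simpler.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

/* exchange the payloads of two nodes; all parent/child links stay untouched */
static void swap_keys(struct tree_elem *a, struct tree_elem *b) {
    int tmp = a->key;
    a->key = b->key;
    b->key = tmp;
}
```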

(7) BST removal

Because there are a number of ways to remove a node from a BST, the manual gave us constraints: when a node is removed, replace it first with the rightmost node of its left subtree; if it has no left child, replace it with the leftmost node of its right subtree; then remove it and return its parent. Because the replacing sequence may have to be repeated, we use a while loop.

When the object node is external: if it is the root, reinitialize the tree; if not, save a pointer to its parent, remove the node, and return that parent. When the object node is internal: if it is a degree-2 node, or a degree-1 node with a left child, we swap the object node with the rightmost node of its left subtree and move the pointer to the swapped position; if it is a degree-1 node with only a right child, we swap it with the leftmost node of its right subtree and move the pointer likewise. In all three cases, if the root has changed, we assign the new root pointer to the tree.
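A sketch of that loop under the usual assumed layout. One deliberate simplification: the "swap" step exchanges keys rather than relinking nodes, which is only safe when external code does not keep pointers to particular nodes; the overall push-down-until-external structure follows the description above.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

static void attach(struct tree_elem *p, struct tree_elem *l, struct tree_elem *r) {
    p->left = l;  p->right = r;
    if (l) l->parent = p;
    if (r) r->parent = p;
}

/* push n down by repeated swaps until it is external, unlink it, free it,
   and return its (former) parent; *root becomes NULL when the tree empties */
struct tree_elem *bst_remove(struct tree_elem **root, struct tree_elem *n) {
    while (n->left || n->right) {
        struct tree_elem *repl;
        if (n->left) {                          /* rightmost node of the left subtree */
            repl = n->left;
            while (repl->right) repl = repl->right;
        } else {                                /* leftmost node of the right subtree */
            repl = n->right;
            while (repl->left) repl = repl->left;
        }
        int tmp = n->key; n->key = repl->key; repl->key = tmp;
        n = repl;                               /* follow the key down */
    }
    struct tree_elem *parent = n->parent;
    if (!parent) *root = NULL;                  /* removed node was the root */
    else if (parent->left == n) parent->left = NULL;
    else parent->right = NULL;
    free(n);
    return parent;
}
```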

(8) BST search

Because a BST has a distinctive structural property, the search algorithm is simple. By comparing key values we decide whether to go to the left child or the right child; if the keys match, we return the current node, and if we reach NULL, we return NULL.
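The search loop, sketched under the same hypothetical struct tree_elem; the name bst_search is my own.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

static void attach(struct tree_elem *p, struct tree_elem *l, struct tree_elem *r) {
    p->left = l;  p->right = r;
    if (l) l->parent = p;
    if (r) r->parent = p;
}

/* descend left or right by key comparison; NULL means the key is absent */
struct tree_elem *bst_search(struct tree_elem *root, int key) {
    while (root) {
        if (key == root->key)
            return root;                         /* keys match: return this node */
        root = (key < root->key) ? root->left : root->right;
    }
    return NULL;                                 /* fell off the tree */
}
```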

(9) Rotations

There are basically two rotations when we deal with an AVL tree. For a right rotation, we first take pointers to the grandparent (tree_elem *grandparent), the parent (tree_elem *parent) and the right child (tree_elem *child) of the object node. If the object node has a grandparent, we attach the object node to the grandparent in the position the parent occupied. After that, we make the parent the right child of the object node and make the saved child the left child of the parent.

For the left rotation, we simply mirror the sequence of the right rotation, exchanging left and right.
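Both rotations, sketched over the assumed struct tree_elem with parent pointers. Matching the description, the argument x is the object node — the child being hoisted over its parent — so rotate_right expects x to be a left child and rotate_left expects a right child; the function names are my own.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

static void attach(struct tree_elem *p, struct tree_elem *l, struct tree_elem *r) {
    p->left = l;  p->right = r;
    if (l) l->parent = p;
    if (r) r->parent = p;
}

/* hoist x over its parent; x must be the left child of its parent */
void rotate_right(struct tree_elem **root, struct tree_elem *x) {
    struct tree_elem *p = x->parent, *g = p->parent, *c = x->right;
    x->parent = g;                       /* attach x where p used to hang */
    if (!g)                *root = x;
    else if (g->left == p) g->left = x;
    else                   g->right = x;
    x->right = p;  p->parent = x;        /* the parent becomes x's right child */
    p->left = c;                         /* x's old right child moves under p */
    if (c) c->parent = p;
}

/* mirror image: x must be the right child of its parent */
void rotate_left(struct tree_elem **root, struct tree_elem *x) {
    struct tree_elem *p = x->parent, *g = p->parent, *c = x->left;
    x->parent = g;
    if (!g)                *root = x;
    else if (g->left == p) g->left = x;
    else                   g->right = x;
    x->left = p;  p->parent = x;
    p->right = c;
    if (c) c->parent = p;
}
```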

(10) LL, RR, LR, RL cases

Before implementing the actual rotation procedures, it is important to find the node whose balance is broken. A dedicated function, struct tree_elem* avl_node(), is used to find the unbalanced node: starting from the given node and walking up through the parents, it compares the heights of each node's children, and if the difference in heights is either 2 or -2, it returns that node.

In the LL case, we first find the unbalanced node, designate its left child as the object node, and do a right rotation. In the RR case, we designate the right child of the unbalanced node as the object node and do a left rotation. In the LR case, we designate the right child of the left child of the unbalanced node as the object node, because the rotation has to happen twice: first a left rotation, then a right rotation. In the RL case the sequence is the opposite: after designating the left child of the right child of the unbalanced node as the object node, we rotate right first and then rotate left.
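A sketch of avl_node and the four case routines, built on the rotations of the previous section. Everything here assumes my hypothetical layout and names (hgt, ll_case and friends); heights are recomputed recursively, which is simple but not the fastest choice.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

static void attach(struct tree_elem *p, struct tree_elem *l, struct tree_elem *r) {
    p->left = l;  p->right = r;
    if (l) l->parent = p;
    if (r) r->parent = p;
}

static long hgt(struct tree_elem *n) {
    if (!n) return 0;
    long hl = hgt(n->left), hr = hgt(n->right);
    return (hl > hr ? hl : hr) + 1;
}

/* walk up from n and return the first node whose balance factor is +-2 */
struct tree_elem *avl_node(struct tree_elem *n) {
    for (; n; n = n->parent) {
        long d = hgt(n->left) - hgt(n->right);
        if (d == 2 || d == -2) return n;
    }
    return NULL;
}

/* hoist x over its parent; x must be the left child of its parent */
void rotate_right(struct tree_elem **root, struct tree_elem *x) {
    struct tree_elem *p = x->parent, *g = p->parent, *c = x->right;
    x->parent = g;
    if (!g)                *root = x;
    else if (g->left == p) g->left = x;
    else                   g->right = x;
    x->right = p;  p->parent = x;
    p->left = c;
    if (c) c->parent = p;
}

/* mirror image: x must be the right child of its parent */
void rotate_left(struct tree_elem **root, struct tree_elem *x) {
    struct tree_elem *p = x->parent, *g = p->parent, *c = x->left;
    x->parent = g;
    if (!g)                *root = x;
    else if (g->left == p) g->left = x;
    else                   g->right = x;
    x->left = p;  p->parent = x;
    p->right = c;
    if (c) c->parent = p;
}

/* z is the unbalanced node; each case picks the object node and rotates */
void ll_case(struct tree_elem **root, struct tree_elem *z) { rotate_right(root, z->left); }
void rr_case(struct tree_elem **root, struct tree_elem *z) { rotate_left(root, z->right); }
void lr_case(struct tree_elem **root, struct tree_elem *z) {
    struct tree_elem *x = z->left->right;   /* object node for the double rotation */
    rotate_left(root, x);
    rotate_right(root, x);
}
void rl_case(struct tree_elem **root, struct tree_elem *z) {
    struct tree_elem *x = z->right->left;
    rotate_right(root, x);
    rotate_left(root, x);
}
```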

(11) AVL insertion

First, insert the node using the BST insertion method, then search for an unbalanced node starting from the inserted node, because any imbalance must occur on the path of the insertion. If there is no unbalanced node, we are done. If there is one (it must be the unbalanced node nearest to the inserted node, because we walk up the tree from the bottom), we can decide which case we are in by comparing the key values of the unbalanced node, its children and the inserted node. For example, if the key of the inserted node is larger than the key of the right child of the unbalanced node, it is the RR case. Once the case is decided, we call the predefined LL, RR, LR or RL function accordingly.

By the properties of the AVL tree, handling the LL, RR, LR or RL case just once is enough to rebalance the tree after an insertion.
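Putting the pieces together, a self-contained avl_insert sketch: BST insertion, avl_node, the key-based case decision, and a single rotation fix-up. As before, struct layout and names are my assumptions; with equal keys going right, ties on the case decision fall on the right/inner side.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical node layout; the original assignment struct is not shown */
struct tree_elem {
    int key;
    struct tree_elem *parent, *left, *right;
};

static struct tree_elem *new_node(int key) {
    struct tree_elem *n = calloc(1, sizeof *n);
    n->key = key;
    return n;
}

/* BST insertion as in section (5): equal keys go to the right */
void bst_insert(struct tree_elem **root, struct tree_elem *n) {
    n->left = n->right = n->parent = NULL;
    if (!*root) { *root = n; return; }
    struct tree_elem *cur = *root;
    for (;;) {
        if (n->key >= cur->key) {
            if (!cur->right) { cur->right = n; n->parent = cur; return; }
            cur = cur->right;
        } else {
            if (!cur->left)  { cur->left = n;  n->parent = cur; return; }
            cur = cur->left;
        }
    }
}

static long hgt(struct tree_elem *n) {
    if (!n) return 0;
    long hl = hgt(n->left), hr = hgt(n->right);
    return (hl > hr ? hl : hr) + 1;
}

/* walk up from n and return the first node whose balance factor is +-2 */
struct tree_elem *avl_node(struct tree_elem *n) {
    for (; n; n = n->parent) {
        long d = hgt(n->left) - hgt(n->right);
        if (d == 2 || d == -2) return n;
    }
    return NULL;
}

void rotate_right(struct tree_elem **root, struct tree_elem *x) {
    struct tree_elem *p = x->parent, *g = p->parent, *c = x->right;
    x->parent = g;
    if (!g)                *root = x;
    else if (g->left == p) g->left = x;
    else                   g->right = x;
    x->right = p;  p->parent = x;
    p->left = c;
    if (c) c->parent = p;
}

void rotate_left(struct tree_elem **root, struct tree_elem *x) {
    struct tree_elem *p = x->parent, *g = p->parent, *c = x->left;
    x->parent = g;
    if (!g)                *root = x;
    else if (g->left == p) g->left = x;
    else                   g->right = x;
    x->left = p;  p->parent = x;
    p->right = c;
    if (c) c->parent = p;
}

/* insert, then fix at most one unbalanced node, choosing the case by keys */
void avl_insert(struct tree_elem **root, struct tree_elem *n) {
    bst_insert(root, n);
    struct tree_elem *z = avl_node(n);
    if (!z) return;                       /* still balanced: nothing to do */
    if (n->key < z->key) {
        if (n->key < z->left->key)
            rotate_right(root, z->left);              /* LL */
        else {
            struct tree_elem *x = z->left->right;     /* LR */
            rotate_left(root, x);
            rotate_right(root, x);
        }
    } else {
        if (n->key >= z->right->key)
            rotate_left(root, z->right);              /* RR */
        else {
            struct tree_elem *x = z->right->left;     /* RL */
            rotate_right(root, x);
            rotate_left(root, x);
        }
    }
}
```

Inserting a sorted sequence is the classic stress test: the tree stays balanced instead of degenerating into a list.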

(12) AVL removal

The AVL removal function has the same structure as the AVL insertion function. First, remove the node using BST removal and note the parent of the physically removed node (that removed node is always external, because of the way we implemented BST removal). Starting from that parent, we look for an unbalanced node, decide the case, and rotate accordingly.

Because there’s a possibility that unbalance in the tree cannot be solved although we handle the unbalanced case one time in the removal case, we need to iterate the balancing sequence until the object node reaches to the root.

(13) AVL search

Because the AVL tree is a special case of the BST, we can simply reuse the same search function as in the BST case.