Community Sites
Create your own community website and start earning today !
It's Free !
 
Communities Members BookmarksPolls Fresher Jobs Funny Pictures MCA Projects New Member FAQ  



My Profile
Active Members
TodayLast 7 Days more...



Awards & Gifts
Online Exams

Fresher Jobs


Our fresher job section is exclusively for fresh graduates! Find jobs for freshers in major Indian cities including Bangalore, Chennai, Hyderabad, Pune or Kochi

Resources


Find educational articles, blogs, discussion threads and other resources.

Colleges


Find details about any college in India or search for courses.

website counter



Btree


Posted Date: 17 Mar 2008    Resource Type: Articles/Knowledge Sharing    Category: General

Posted By: Aparanjitha       Member Level: Gold
Rating:     Points: 5



In computer science, a B-tree is a tree data structure that keeps data sorted and allows searches, insertions, and deletions in logarithmic amortized time. It is most commonly used in databases and filesystems.

In B-trees, internal nodes can have a variable number of child nodes within some pre-defined range. When data is inserted or removed from a node, its number of child nodes changes. In order to maintain the pre-defined range, internal nodes may be joined or split. Because a range of child nodes is permitted, B-trees do not need re-balancing as frequently as other self-balancing search trees, but may waste some space, since nodes are not entirely full. The lower and upper bounds on the number of child nodes are typically fixed for a particular implementation. For example, in a 2-3 B-tree (often simply referred to as a 2-3 tree), each internal node may have only 2 or 3 child nodes.

A B-tree is kept balanced by requiring that all leaf nodes are at the same depth. This depth will increase slowly as elements are added to the tree, but an increase in the overall depth is infrequent, and results in all leaf nodes being one more hop further removed from the root.

B-trees have substantial advantages over alternative implementations when node access times far exceed access times within nodes. This usually occurs when most nodes are in secondary storage such as hard drives. By maximizing the number of child nodes within each internal node, the height of the tree decreases, balancing occurs less often, and efficiency increases. Usually this value is set such that each node takes up a full disk block or an analogous size in secondary storage. While 2-3 B-trees might be useful in main memory, and are certainly easier to explain, if the node sizes are tuned to the size of a disk block, the result might be a 129-513 B-tree.

A B-Tree of order m (the maximum number of children for each node) is a tree which satisfies the following properties :

1. Every node has <= m children.
2. Every node ( except root and leaves ) has >= m/2 children.
3. The root has at least 2 childr en.
4. All leaves appear in the same level, and carry no information.
5. A non-leaf node with k children contains k – 1 keys


The B-tree's creators, Rudolf Bayer and Ed McCreight, have not explained what, if anything, the B stands for. The most common belief is that B stands for balanced, as all the leaf nodes are at the same level in the tree. B may also stand for Bayer, Branching Tree, or for Boeing, because they were working for Boeing Scientific Research Labs at the time.

Node structures

Each internal node's elements act as separation values which divide its subtrees. For example, if an internal node has three child nodes (or subtrees) then it must have two separation values or elements a1 and a2. All values in the leftmost subtree will be less than a1 , all values in the middle subtree will be between a1 and a2, and all values in the rightmost subtree will be greater than a2.

Internal nodes in a B-tree — nodes which are not leaf nodes — are usually represented as an ordered set of elements and child pointers. Every internal node contains a maximum of U children and — other than the root — a minimum of L children. For all internal nodes other than the root, the number of elements is one less than the number of child pointers; the number of elements is between L-1 and U-1. The number U must be either 2L or 2L-1; thus each internal node is at least half full. This relationship between U and L implies that two half-full nodes can be joined to make a legal node, and one full node can be split into two legal nodes (if there is room to push one element up into the parent). These properties make it possible to delete and insert new values into a B-tree and adjust the tree to preserve the B-tree properties.

Leaf nodes have the same restriction on the number of elements, but have no children, and no child pointers.

The root node still has the upper limit on the number of children, but has no lower limit. For example, when there are fewer than L-1 elements in the entire tree, the root will be the only node in the tree, and it will have no children at all.

A B-tree of depth n+1 can hold about U times as many items as a B-tree of depth n, but the cost of search, insert, and delete operations grows with the depth of the tree. As with any balanced tree, the cost grows much more slowly than the number of elements.

some balanced trees store values only at the leaf nodes, and so have different kinds of nodes for leaf nodes and internal nodes. B-trees keep values in every node in the tree, and may use the same structure for all nodes. However, since leaf nodes never have children, a specialized structure for leaf nodes in B-trees will improve performance.


Algorithms

Search

Search is performed in the typical manner, analogous to that in a binary search tree. Starting at the root, the tree is traversed top to bottom, choosing the child pointer whose separation values are on either side of the value that is being searched.

Binary search is typically (but not necessarily) used within nodes to find the separation values and child tree of interest.


Insertion

A B Tree insertion example with each iteration.
A B Tree insertion example with each iteration.

All insertions happen at the leaf nodes.

1. By searching the tree, find the leaf node where the new element should be added.
2. If the leaf node contains fewer than the maximum legal number of elements, there is room for one more. Insert the new element in the node, keeping the node's elements ordered.
3. Otherwise the leaf node is split into two nodes.
1. A single median is chosen from among the leaf's elements and the new element.
2. Values less than the median are put in the new left node and values greater than the median are put in the new right node, with the median acting as a separation value.
3. That separation value is added to the node's parent, which may cause it to be split, and so on.

If the splitting goes all the way up to the root, it creates a new root with a single separator value and two children, which is why the lower bound on the size of internal nodes does not apply to the root. The maximum number of elements per node is U-1. When a node is split, one element moves to the parent, but one element is added. So, it must be possible to divide the maximum number U-1 of elements into two legal nodes. If this number is odd, then U=2L and one of the new nodes contains (U-2)/2 = L-1 elements, and hence is a legal node, and the other contains one more element, and hence it too is legal. If U-1 is even, then U=2L-1, so there are 2L-2 elements in the node. Half of this number is L-1, which is the minimum number of elements allowed per node.

An improved algorithm supports a single pass down the tree from the root to the node where the insertion will take place, splitting any full nodes encountered on the way. This prevents the need to recall the parent nodes into memory, which may be expensive if the nodes are on secondary storage. However, to use this improved algorithm, we must be able to send one element to the parent and split the remaining U-2 elements into two legal nodes, without adding a new element. This requires U = 2L rather than U = 2L-1, which accounts for why some textbooks impose this requirement in defining B-trees.


Deletion

There are two popular strategies for deletion from a B-Tree.

* locate and delete the item, then restructure the tree to regain its invariants

or

* do a single pass down the tree, but before entering (visiting) a node, restructure the tree so that once the key to be deleted is encountered, it can be deleted without triggering the need for any further restructuring

The algorithm below uses the former strategy. The latter strategy appears in pseudo-code in Introduction to Algorithms (see below).

There are two special cases to consider when deleting an element:

1. the element in an internal node may be a separator for its child nodes
2. deleting an element may put it under the minimum number of elements and children.

Each of these cases will be dealt with in order.

Deletion from a leaf node

* Search for the value to delete.
* If the value is in a leaf node, it can simply be deleted from the node, perhaps leaving the node with too few elements; so some additional changes to the tree will be required.

Deletion from an internal node

Each element in an internal node acts as a separation value for two subtrees, and when such an element is deleted, two cases arise. In the first case, both of the two child nodes to the left and right of the deleted element have the minimum number of elements, namely L-1. They can then be joined into a single node with 2L-2 elements, a number which does not exceed U-1 and so is a legal node. Unless it is known that this particular B-tree does not contain duplicate data, we must then also (recursively) delete the element in question from the new node.

In the second case, one of the two child nodes contains more than the minimum number of elements. Then a new separator for those subtrees must be found. Note that the largest element in the left subtree is the largest element which is still less than the separator. Likewise, the smallest element in the right subtree is the smallest element which is still greater than the separator. Both of those elements are in leaf nodes, and either can be the new separator for the two subtrees.

* If the value is in an internal node, choose a new separator (either the largest element in the left subtree or the smallest element in the right subtree), remove it from the leaf node it is in, and replace the element to be deleted with the new separator.
* This has deleted an element from a leaf node, and so is now equivalent to the previous case.


Rebalancing after deletion

If deleting an element from a leaf node has brought it under the minimum size, some elements must be redistributed to bring all nodes up to the minimum. In some cases the rearrangement will move the deficiency to the parent, and the redistribution must be applied iteratively up the tree, perhaps even to the root. Since the minimum element count doesn't apply to the root, making the root be the only deficient node is not a problem.

The strategy is to find a sibling of the deficient node which has more than the minimum number of elements and of the separator and the values in both nodes as the new separator and put that in the parent.


o Redistribute the remaining elements to the right and left children.
o Redistribute the subtrees of the two nodes to parallel the redistribution of the elements. The subtrees themselves are transplanted entirely, and are not altered if moved to a different parent node, and this can be done as the elements are redistributed.
* If the sibling node immediately to the right of the deficient node has only the minimum number of elements, examine the sibling node immediately to the left.
* If both immediate siblings have only the minimum number of elements, create a new node with all the elements from the deficient node, all the elements from one of its siblings, and the separator in the parent between the two combined sibling nodes.
o Remove the separator from the parent, and replace the two children it separated with the combined node.
o If that brings the number of elements in the parent under the minimum, repeat these steps with that deficient node, unless it is the root, since the root may be deficient.




Responses

Author: Deepu    19 Mar 2008Member Level: Diamond   Points : 2
Good work Aparanjitha.Keep going with your postings.Its really very nice.your cs knowledge is very good.keep on posting these kind of articles.These will be help ful to many IT students.


Author: Aparanjitha    19 Mar 2008Member Level: Gold   Points : 3
thanks for ur suggestions and encouragement keep on reading my resoucres and give me some more advises . As you said before i want to gain knowledge in all fields and place resources that will be help ful to non cs . Thank you



Feedbacks      
Popular Tags   What are tags ?   Search Tags  
(No tags found.)

Post Feedback


This is a strictly moderated forum. Only approved messages will appear in the site. Please use 'Spell Check' in Google toolbar before you submit.
You must Sign In to post a response.
Next Resource: Top five causing cancer
Previous Resource: Networking
Return to Discussion Resource Index
Post New Resource
Category: General


Post resources and earn money!
 
Related Resources

Watch TV Channels



Contact Us    Editors    Privacy Policy    Terms Of Use   

SpiderWorks Technologies Pvt Ltd. 2006 - 2007 All Rights Reserved.