Talk:Binary search tree

From Wikipedia, the free encyclopedia

Shouldn't the definition of BST be that every node's right subtree has values greater than or equal to the node's value? The current definition says it's strictly greater, which isn't the case.

You can put equal nodes either on the left or the right, but not both. It's an arbitrary decision. I'll say this somewhere. Deco 01:17, 12 May 2005 (UTC)

I disagree. Here is an excerpt from the lecture I once took at UC Berkeley: "A binary search tree satisfies the _binary_search_tree_invariant_: for any node X, every key in the left subtree of X is less than or equal to X's key, and every key in the right subtree of X is greater than or equal to X's key.". The full lecture can be found here (http://www.cs.berkeley.edu/~jrs/61b/lec/26). I've also verified this with my reference book on algorithms.

I'm sorry, it can be done this way (although there are authoritative references which don't do it this way). It makes some operations more difficult, such as finding all the nodes with a given value, but it really makes almost no difference which way it is defined. In the article I've used your definition, then clarified that in some definitions equal values can be excluded from the left or right side. Deco 01:02, 14 May 2005 (UTC)

By this definition self-balancing binary search trees are not binary search tree. Rotations do not keep this _binary_search_tree_invariant_. To be concrete: Consider an empty AVL tree and insert three equal values. After the rotation there are equal values to both sides of the root node. So you either need to exclude equal values from binary search trees or allow equal values to both sides of a node. E.g: "The left subtree of a node contains only values less or equal to the node's value. The right subtree of a node contains only values greater than or equal to the node's value." 141.84.26.134 01:04, 27 February 2007 (UTC)

It is confusing that the article starts by claiming that equal nodes should always be in the right subtree, while the example image puts equal nodes (such as the seven, which equals the root) in the left subtree. 128.84.153.196 13:29, 29 September 2006 (UTC)

Contents

[edit] Copyright violation?

Why does the blurb about Optimal BSTs have a reference to "section 15.5 of Introduction to Algorithms" ? Is this text copied and pasted out of a book? Is this a (c) violation?

Copyright does not cover ideas but an explicit realization of those ideas (see idea-expression divide). The text here, written by me, was based on my understanding of the material in CLRS and not copied verbatim from that book. Deco 00:41, 15 December 2005 (UTC)

[edit] Why are the examples in Python?

I know that the example algorithms are merely for show - but wouldn't examples in a more mainstream language like C/C++ or Java be better for the purpose of understanding? The Phython deletion algorithm, for instance, uses return structures that are foreign to C/C++ or similar languages.

If you think a mainstream programmer might find them difficult to understand that's a legitimate concern. I agree that "find_remove_max" is quite obscure in its use of tuples of two return values. On the other hand, I think the use of tuples or records combined with pattern matching really abbreviates things like the insertion and search algorithms, while still being pretty clear, which is a pretty good reason to use them instead of C/C++/Java. I'm going to look at rewriting parts of the deletion code in a more intuitive way... er, as soon as I learn a little Python. :-) Deco 22:27, 15 February 2006 (UTC)
I decided to write "destructive" versions of insert and delete in C++, which lends itself quite well to this purpose, as you can see from the relative simplicity of the resulting code. I left in the Python insertion to demonstrate the concept of persistent trees, but took out the excessively complex Python deletion example. I hope this makes things clearer for you! Deco 03:48, 16 February 2006 (UTC)

Yes - thank you very much. The C++ example is much more clear to me - and you are right, it was the returning tuples that was causing my head to spin. --208.186.134.102 06:00, 16 February 2006 (UTC) Wouldn't the code in the new C++ delete example cause memory leaks when you set "node=node->right" and don't delete node->right? (or node->left)... (thanks for adding it... it's a good, simple example. very clean) ++Ryangardner

Good point. I've gotten too used to garbage collection. :-) Thanks for your positive feedback. Deco 06:27, 16 February 2006 (UTC)

I see a few other (minor) flaws in the algorithm on the page. If you are deleting a leaf node, you must remove the link from the parent of that node to the node itself. I also think that the delete method as it is implemented would cause an error, because you are deleting the node and then trying to get data from it... Ryangardner

[edit] Merge with binary search algorithm

It seems to me this article details the same concept as Binary search algorithm. I suggest a merger.He Who Is 01:59, 25 May 2006 (UTC)

No, they are very different. The binary search algorithm is an algorithm that works on sorted arrays. To use it, you need an array and you need to sort it beforehand (which is expensive). The binary search tree is a data structure. Unlike arrays, binary search trees do not preserve order of insertion. Binary search trees are used because they have the advantage that the tree structure encodes the relative ordering of the elements (i.e. it is always "sorted"), without needing to sort. --Spoon! 00:04, 24 August 2006 (UTC)

[edit] Errors

The deletion algorithm should check for the case of both left and right being NULL before checking either side.

Indeed, the deletion doesn't seem to work. Could anybody fix this? Jogers (talk) 15:14, 6 September 2006 (UTC)
Yup, it does seem to be broken. Here is some code that should work. It compiles, but I don't feel up to verifying that it works 100% by actually testing it. It will be better than what's there, though...
 void DeleteNode(struct node* node) {
     struct node **parentLink;
 
     // If there is no parent, we are a root node
     if (parent == NULL) { 
         free(node);
         return;
     }
       
     // Figure out which child of the parent we are
     if (node == node->parent->left)
         parentLink = &(node->parent->left);
     else if (node == node->parent->right) 
         parentLink = &(node->parent->right);
 
     // This will handle cases where both children are null.
     if (node->left == NULL) { 
         *parentLink = node->right;
         free(node); // Free any memory allocated for the node
     } else if (node->right == NULL) {
         *parentLink = node->left;
         free(node); // In C++, delete node could be used here
     } else {
         // Follow the right side of the tree until there is an empty space
         struct node* temp = node->left;
         while (temp->right != NULL) {
             temp = temp->right;
         }
         node->value = temp->value;
         DeleteNode(temp);
     }
 }
Wolever 02:50, 31 March 2007 (UTC)

[edit] Does it necessarily need a total ordering?

C++ STL's "Sorted Associative Containers" (which includes "set", "multiset", "map", and "multimap" classes, all of which are binary search trees) only requires its key types to be a strict weak ordering: http://www.sgi.com/tech/stl/SortedAssociativeContainer.html --Spoon! 00:08, 24 August 2006 (UTC)

You're right, a strict weak ordering suffices to order the tree and enable useful operations. The details behind this are rather complicated though and I think would be best relegated to their own section. Deco 04:09, 24 August 2006 (UTC)

[edit] Optimal binary search trees

In the section about Optimal binary search trees, it is mentioned that:

We can then use a dynamic programming solution, detailed in section 15.5 of Introduction to Algorithms by Thomas H. Cormen Sec Edition, to construct the tree with the least possible expected search cost.

I wish that someone who has the book (or has a nearby library from which he can borrow it) could take the time to write about it.

Thanks. --Hosam Aly 20:26, 7 January 2007 (UTC)

[edit] Insertion

I think the insertion code is not very clear, so I suggest this one, written in pseudocode:

x is current node, y always points to x's parent.

Tree-Insert(T, z)
 y = null                        //x will be set to root, root's parent is null
 x = root(T)
 while x != null do              //we will traverse the tree until we find the leaf
    y = x                        //we are going to change x, so store x as it will became next parent
    if key(z) < key(x)           //if the value of inserted node is lesser then actual node value, move left
      then x = left(x)
      else x = right(x)
                                 //now we should have y pointing to the leaf, and x to null (notice when the while loop terminates)
 parent(z) = y                   //this leaf will become our parent
 if y = null                     //this is a special case, when we are inserting into an empty tree
   then root(T) = z              //so the inserted node will be the root node
   else
     if key(z) < key(y)          //tree is non-empty, so insert the node
       then left(y) = z
       else right(y) = z


Kremso 17:15, 11 January 2007 (UTC)

Heck, why not just re-write all the code in psudo code? It will keep people from directly copy+pasting it in to their code (which generally won't work so well) and it is quite a bit easier to understand. Wolever 02:50, 31 March 2007 (UTC)