Talk:Bogosort

From Wikipedia, the free encyclopedia

Contents

[edit] bogosort is bogo-sort

http://catb.org/~esr/jargon/html/B/bogo-sort.html it should be bogo-sort with a redirect. --12.214.77.95, 22:22 19 May 2005

NIST's Dictionary of Algorithms and Data Structures says bogosort. --Zundark 08:27, 20 May 2005 (UTC)

[edit] Bogosort, half bogus, half not.

The half that is bogus, is of course the Python code and bash code that use a sorting algorithm to perform... a sorting algorithm! It's not hard to check for sorted-ness, and it's most definately not the same as sorting the items!

Please change them to use loops like these:

for( i=1; i < size; i++) {
  if( array[i-1] > array[i] )
    return false;
}
return true;

(above assumes zero-based arrays)

I'm not good enough with python syntax to write it myself, and I rather despise bash syntax, so I'm not going to do that one either. :) (sorry!)

The half that is not bogus is the algorithm itself. First, it is an extremely poor algorithm, even worse than bubble sort and it's ilk (all running at O(n^2)). This is actually useful, because you can compare it to other algorithms. For sorting routines we have bogosort at O(n*n!), and then a class of routines at O(n^2), and then another class at O(n*log n). (I've skipped shell sort cause it's O(n^i) where 'i' is determined by the shells you choose and I don't remember offhand what the smallest value is. I've also skipped ones that operate in O(n), but only in special cases).

Additionally, bogus sort is a good example (if poor running) at a probablistic algorithm. There are several algorithms that rely on randomness to complete, and this is a legitimate (again, if slow) member. Just because we wouldn't use it in real code doesn't mean it's not of academic importance.

And -because- it does use random numbers, and happens to have such a high complexity, it shows off an important lesson in random number generators: They break! If you have a random number generator that has a cycle length of at 2^32, then you can only be guaranteed that bogosort will complete with n less than 12! If you move up to a 2^64 cycle it manages to work for n up to 19!

You may say that this is still impractical, but realize that permutations often grow to these high numbers too, and if you start using random number generators to simulate things, you can only get the same number of possibilities as you random number generator can make. I personally think that bogosort (while impratical in it of itself) drives home that point fairly well.

Also, I don't agree that it's necessarily the worse case; because, afterall you could design an algorithm that pathologically chooses a worse way to sort the array. Though I suppose at some point you'd have to call foul, and say that you're just being silly... (Afterall, bogo sort doesn't even have a gauranteed worse case time, and it also doesn't gaurantee that each operation moves you "closer" to a sorted array -- though that it -does- work still, which makes it a little cool, no?)

--Gryn 03:49, 26 August 2005 (UTC)

[edit] python and bash

I have moved them out of article, because they use external sorting routines. Someone needs to add some is_sorted function. Exe 20:02, 22 November 2005 (UTC)

[edit] Python

from random import shuffle

def bogosort(seq):
    while seq != sorted(seq):
        shuffle(seq)

[edit] Bash Script

#!/bin/bash

# Prints all the lines in their current order.
function echolines
{
  for ((i=0;i<$NUMLINES;i++))
  do
    echo "${LINES[i]}"
  done
}

IFS=$'\n'

LINES=($(cat $1))
NUMLINES=${#LINES[@]}

while ! echolines | sort -c >/dev/null 2>/dev/null
do
  for ((i=0;i<$NUMLINES;i++))
  do
    #Shuffle the lines
    TEMP="${LINES[i]}"
    RAND=$((RANDOM%NUMLINES))
    LINES[i]="${LINES[RAND]}"
    LINES[RAND]="$TEMP"
  done
done

echolines

[edit] Bozo sort merged here

See Wikipedia:Articles for deletion/Bozo sort. Johnleemk | Talk 08:34, 28 January 2006 (UTC)

[edit] Running Time Bozosort

Actually, I've thought about this, and came to the conclusion that bogosort and bozosort are completely equally efficient. If the worst-case running time is infinite, then it does not matter even the least how much progress the algorithm does on one iteration. In fact, taken completely literally, even a busy-loop whose payload did absolutely nothing whatsoever would be equally efficient, at least considering the worst-case running time. JIP | Talk 21:04, 28 January 2006 (UTC)

Worst case running time is not the only suitable metric for measuring efficiency; see Best, worst and average case.--Malcohol 13:34, 31 January 2006 (UTC)
Indeed; when you're discussing a random algorithm of this sort, the worst-case will always approach infinity. More interesting are the best and average cases. The best should be fairly trivial to calculate. For bogosort, it is 1 randomization and n-1 comparisons to discern that the list is ordered. Same for bozosort; 1 swap, and n-1 comparisons. The average time is trickier; I can't actually figure out which one would be better. I suppose, unless some master of probability wants to take a crack at calculations, we could try some experimentation? It would require pretty small data sets, so I'm not sure how useful that would be... ~ Booya Bazooka 19:20, 26 May 2006 (UTC)
Note that Bogo Sort and Bozo Sort are instances of Las Vegas Algorithms. Thus it makes no sense to speak about the worst or average case running time (both depend on the outcomes of random experiments); The thing being of interest is the expected value of the running time, in the best, worst, or average case. BTW I have analyzed the expected number of swaps and comparisons needed on the average by Bogo Sort and Bozo Sort. For the analysis, I've used the setup where all array elements are different. For Bogo Sort the expected number of comparisons/swaps in the worst case is also done. The worst case for Bozo Sort seems to be more difficult. I'll typeset the analysis and be back with a reference. For the moment, I note that the running time of Bozo Sort outperforms the one of Bogo Sort on the average by a linear factor. In the best case for both algorithms, the array is already sorted, the (expected) number of swaps is zero, and the (expected) number of comparisons equals n-1. Hermann Gruber
I'd suggest that the notion of average running time is quite clearly talkable. An average of the running times of N trials is sensible, provide that no run chances upon an infinite time. The expected time is the summation of all possible running times multiplied by their probability of occurrence, clearly the average if all probabilities are equal. The worst cases of infinite times (corresponding to always unfortunate random choices) can make the summation diverge, if the probability of such sequences does not fall faster than the time cost rises so that the contribution remains finite. Put another way, for a set of values of size N, a graph of probability vs. running time could be deduced, the question being whether the area under it is finite, especially considering the long tail towards high running times.

For instance, the famous QuickSort runs in order N.Log(N) time on average, but the worst case is order N**2. There are two levels of randomness: the first in preparing the input values, and in Bogo/Bozo, a second level of randomisation is used in the sorting algorithm. The analysis of QuickSort normally need consider only the first level, but a variant has the selection of pivot element be at random, specifically to dodge the preparation of carefully-chosen worst-case-provoking data sets. Though some other data set might provoke a worst-case performance for the specific random sequence of pivots that will be chosen which data set would have been sorted without trouble by a non-random (or different random) selection of pivots. Now I'm becoming confused! NickyMcLean 21:52, 24 September 2006 (UTC)

What you suggested looks quite similar to the expected value of the running time. By letting the number of trials N go to infinity, I think you'll get the expected running time (at least when disregarding infinite runs); infinite times (corresponding to always unfortunate random choices) as you suggest occur with probability 0, by the infinite monkey theorem. Nevertheless, it is possible that they occur, as is the case in general with Las Vegas Algorithms. You can still make a distinction between expected running time on a worst case input and expected time on an average input, which may be the source of your confusion: Randomized QuickSort has expected running time O(n log n) on *all* n! possible inputs, thus the expected running time on the average is (n! O(n log n)) /(n!) = O(n log n) -- the numerator is the sum of the expectations on all inputs of size n, the denominator is the number of inputs of size n. There we took the average kind of twice, once by computing the expectation, and once by taking the average. Hermann Gruber

[edit] Wikisource

Just a short note that Wikisource is not a dumping ground for code. Can we please refrain from doing this? - Ta bu shi da yu 07:35, 29 January 2006 (UTC)

[edit] No discussion of the O(N) variant

In Python:

from random import shuffle
import UniverseTools

def is_sorted(seq):
    for i in range(len(seq) - 1):
        if seq[i] > seq[i+1]: 
            return False
    return True

def bogosort(seq):
    shuffle(seq)
        if not is_sorted(seq):
            UniverseTools.destroy()

You will need to write UniverseTools is C or Pyrex (it's not possible in Python).

Please do not try this unless you

  1. have a suitable random number generator, and
  2. are certain that the Many Worlds Interpretation is true.

69.107.82.95 18:58, 12 March 2006 (UTC)

[edit] Puzzlement

From the article:

while this algorithm is O(n) in time, permuting the list requres that we consume of O(n log n) bits of quantum randomness.

Could someone explain this and/or provide a citation? Thanks, 68.147.56.203 06:37, 3 September 2006 (UTC)

Someone corrected a mistype. All very well, but I also still don't understand what it means. NickyMcLean 21:16, 24 September 2006 (UTC)

I don't know a citation off-hand, but there aren't any particularly new ideas involved. There are O(2^(n log n)) permutations of an n item list, therefore specifying one particular permutation requires O(n log n) bits. This is why you can't build a (non-quantum) sorting algorithm that needs fewer than O(n log n) comparisons, because each comparison gives you only one bit. In the bogosort case, you need to ensure that every permutation may be generated by your random shuffle, which means you need O(n log n) bits of randomness, which means you need to, say, observe the spin of O(n log n) separate particles or whatever you're using to get quantum randomness. And when doing quantum stuff, randomness is a resource, just like power or memory or whatever -- once you've observed those particles they aren't random anymore, the next time you want some randomness you'll need to find some new particles. 66.159.194.130 11:28, 12 November 2006 (UTC)

[edit] Quantum

For those inputs, observers would be surprised by mysterious failures of the computer or improbable accidents preventing its operation because all universes in which it did operate are destroyed.

This must be based on some misunderstanding of how quantum physics and quantum computers work. --Apoc2400 06:53, 13 March 2007 (UTC)

[edit] When is bogosort used?

The article says that bogosort is used only for educational purposes, but I rather doubt this. Bogosort must be the method of choice whenever one does not have access is individual elements. For example, you're trying to sort some long chain-like molecule. All you can do is bombard it to break it apart, and then see how it recombines. In this sense, Bogosort is related to Pancake sort. It's a bad sorting algorithm generally, but either may become the method of choice when access is restricted in a some way.

I'm curious also whether bogosort is reported in the psychological literature. Do children ever use this sorting algorithm? --Dale Gerdemann 13:34, 9 November 2007 (UTC)

In my Information Structures class, while they didn't mention bogosort specifically, they did say one point of the sorting sections was to demonstrate how algorithms can range widely in efficiency. they started by showing how three algorithms to square a number could be written basically as poorly as you would like to:
1) answer = n^2; O(1)
2) for n times( answer += n;) O(n)
3) for n times( for n times( answer += 1; ) ) O(n^2)
while they used bubblesort as the laughing stock of sorting algorithms, i wouldn't be surprised to see bogosort used somewhere else. Sahuagin (talk) 02:34, 24 November 2007 (UTC)

[edit] <blank>

What is the purpose of this page? Bogosort is a stupid idea and should remain in the jargon file. This article reduces the credibility of the other articles of Wikipedia. NTF Oct 6 2002.

This won't reduce the credibility of any other page on Wikipedia. It covers a very real concept that has some interesting properties. The information presented on this page is, as far as I can see, factual. Bogosort exists. It might be completely useless for any practical purpose, but the same could be said of Lord of the Rings. --nknight 12:27 Feb 16, 2003 (UTC)
Bogosort does exist. I first learnt about it in a computer science lecture at university! Martin
This page is very useful. I used the information on this page to do my AP Computer Science homework. Although there are many sorting algorithms that are much better, for example Quicksort, Bogosort teaches new students to Java about sorting an array. It best shows the difference between good sorting methods and the bad ones. Bdodo1992 01:43, 3 October 2007 (UTC)

[edit] <blank>

The code is incorrect --NTF Oct 6 2002.

Incorrect C code moved here. Maybe someone can be bothered to fix it, but I can't. --Zundark 12:46 Feb 16, 2003 (UTC)

Checking the edit history, two days after NTF noted there was a bug, one was corrected. I can't personally verify the code, either, but presumably his statement made Wesley go through and do so. -- nknight 12:50 Feb 16, 2003 (UTC)
It's not correct. Whoever wrote it doesn't even know the difference between C and C++. --Zundark 12:54 Feb 16, 2003 (UTC)
OK, fine, I'll take a look at it in detail when I'm actually awake. -- nknight 12:56 Feb 16, 2003 (UTC)

How about this? I've verified its operation with up to ten items. A bit more verbose than the C++ implementation. -- nknight 03:06 Feb 17, 2003 (UTC)

void bogosort(int *data, int length) {
    int i, j, tmp;
    int *visited = malloc(length*sizeof(int));
    srand(time(0));
    while (!sorted(data, length)) {
        memset(visited, 0, length);
        for (i = 0; i < length; ++i) {
            j = rand() % length;
            if (!visited[j]) {
                tmp = data[i];
                data[i] = data[j];
                data[j] = tmp;
                visited[j]++;
            }
        }
    }
}

int sorted(int *data, int length) {
    int i;
    for (i = 0; i < length-1; ++i)
        if (data[i] > data[i+1])
            return 0;
    return 1;
}
The memset is wrong (there are length*sizeof(int) bytes), but you don't need the visited array anyway - just exchange data[i] with data[i+rand()%(length-i)] in the loop. (Of course, rand()%(length-i) won't be quite uniform if length-i doesn't divide RAND_MAX+1, so this isn't really correct either.) I prefer the C++ implementation I gave, as it doesn't have to deal with the irrelevant details of how to implement a random shuffle properly. --Zundark 10:42 Feb 17, 2003 (UTC)

In C++ bogosort can be formulated really concisely:

#include <algorithm>

template<class T>
void bogosort(T begin, T end) {
  while (!is_sorted(begin, end)) { random_shuffle(begin, end); }
}

For someone who knows the language this is really easy to understand. On the other hand, somebody without C++-knowledge might be puzzled by the use of iterators. Any comments? MH 00:02, 9 Nov 2003 (UTC)

I think it's so short and simple, you might as well put it in. Martin 00:26, 9 Nov 2003 (UTC)

Be-yooo-ti-ful! I'd make it non templated and have it operate on ints though. The templating just obscures the point, and will really obscure it for non C++ coders (and for C coders who will puzzle over the use of operator< to compare any type, including strings -- "where's the callback function pointer???"). orthogonal 18:14, 16 Nov 2003 (UTC)

If all we want to do is give the algorithm we can as well put it in pseudocode:
while not is_sorted(array)
   array := random_permutation(array)
This emphasises the simplicity of the algorithm and avoids digging into language details. If readers want to learn about templates or about C++ they'll look at other places anyway. MH 12:53, 24 Nov 2003 (UTC)

[edit] <blank>

Just to say: I am a C++ programmer and this page has a very real use. It is well documented in the programming press. It should be remembered that computer-types often have a wry sense of humour that may not be understood by all. Nevertheless, worst-case algorithms are very useful to know about. --/Mat 15:12, 24 Mar 2004 (UTC)

[edit] Random sort?

I've always heard this algorithm called random sort in computer textbooks everywhere. I'm adding that as a secondary usage, but I suspect it may be a main one. Anyone have thoughts on that? - Chardish (talk) 22:02, 9 December 2007 (UTC)

[edit] Worst case

The article claims that the worst-case time is (n-1)×n!. However, it's clear that the worst-case time is unbounded, for the same reason that a coin may land on heads an unbounded number of times in a row before it lands on tails (even if, giving infinitely many trials, it is almost certain to land on tails). I'm changing it. Dcoetzee 21:09, 10 February 2008 (UTC)

Well, that's decisive of you. There is a "running time" section above that you could have noticed. The argument is not as obvious as you might like to think. Suppose an algorithm is to toss a (fair) coin and stop on occurrence of a Head. The expected number of iterations is easily computed, namely 2. However the upper bound is not so clear. Yes, every toss might yield a tail, but the probability of an infinite number of tails happening (chant "Limit, n → infinity") is zero. So, it won't happen? That's what a probability of zero means. Part of the problem is that this requires an infinity in completion, not merely in potential. Still, it is probably better to say that there is no upper bound on the number of iterations, if an upper bound is to be talked about at all. One could give figures for the probability that the number of iterations exceeds x, and clearly F(x) → 0 as x → infinity, but it never arrives at zero itself.
It's certainly true that it has probability 1 of succeeding in an infinite number of trials - but the worst-case time is unbounded in the sense that for any runtime t, it can take longer than t, although with diminishing probability as t becomes very large. I think we're in agreement. Dcoetzee 19:24, 11 February 2008 (UTC)

The last time I looked at this I found that the chance of success after N! shuffles of N elements is (1 - 1/e)NickyMcLean (talk) 02:33, 11 February 2008 (UTC)