Talk:Radix sort
Quicksort is not O(n*log n), it's O(n^2); I fixed this.
- Quicksort utilizing a random pivot can be shown to be Θ(n*log n), as instead of one decision tree for the program there exist many. As one makes n arbitrarily large, the probability of picking a tree which perfectly matches the sortedness of the data set such that the pivot is always "worst case" (and thus yields an n^2 run) becomes increasingly small. Since when we perform algorithmic analysis we are looking at the limit as n goes to infinity, we can state that this probability goes to 0 and thus that quicksort is indeed Θ(n*log n).
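For reference, the standard textbook way to make the expected-case claim precise (assuming a uniformly random pivot and distinct keys): the expected number of comparisons C(n) satisfies
\[
E[C(n)] = (n-1) + \frac{1}{n}\sum_{i=0}^{n-1}\bigl(E[C(i)] + E[C(n-1-i)]\bigr), \qquad E[C(0)] = E[C(1)] = 0,
\]
which solves to E[C(n)] = 2(n+1)H_n - 4n ≈ 1.39 n*log2 n (H_n is the n-th harmonic number), i.e. Θ(n*log n) in expectation, even though the worst case remains Θ(n^2).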
Nonsense! You must state in advance whether you would like to discuss/consider/analyze average-case, worst-case, or whatever behaviour!! BTW: when talking about radix sort one should also mention its generalization, hash sort, which is on average asymptotically the fastest known sorting method, O(n), due to the fact that it uses knowledge about the expected distribution of the keys. With a well-chosen hash function its run time is C*n + o(n). Enlightening example: sort N uniformly and independently distributed random numbers in the half-open interval [0,1[ --- or even more instructive: sort a permutation of the first N natural numbers! The latter can be done optimally because of sufficient knowledge of the expected data.
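As an illustration of that last point, here is a minimal sketch (not from the article; the function name is my own) of sorting a permutation of 0..N-1 in O(N) time and O(1) extra space, using the knowledge that each value names its own final position:
#include <stddef.h>

void permutation_sort(unsigned *a, size_t n)
{
    /* Every value v in a[] is known to belong at index v, so repeatedly
       move a[i] to the slot it names.  Each swap puts at least one value
       into its final position, so at most n-1 swaps occur in total: O(n). */
    for (size_t i = 0; i < n; i++) {
        while (a[i] != i) {
            unsigned v = a[i];
            a[i] = a[v];
            a[v] = v;
        }
    }
}
The uniform [0,1[ example is the same idea: with a known distribution, the key itself tells you approximately where it belongs.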
std::sort is based on quicksort and heapsort, and is O(n*log(n)). Stop knocking quicksort; yes, it's really O(n*n), but std::sort fixes that problem, and for real problems even raw quicksort is theta(n*log(n)). Hashsort is O(2k(n))[1], which is not O(n) for large k. When you compare realistic sorting algorithms that involve radix or hash-based sorting, you must assume both large n and large k. Bucketsort obviously is O(n) when log(n) > k, but it's impractical for k >> log(n), which is a major set of problems (some would argue most problems). Hashsort is slower than std::sort for large k with collisions. Now, if you would like a comparison of hashsort vs. Spreadsort or some other practical general algorithm, I'd be happy to make one. Cuberoot31 00:07, 2 December 2006 (UTC)
Does the sample implementation given depend on the endianness used by the computer? I'm not quite sure what exactly the >> operator does. --AxelBoldt
The value of x >> y is floor(x / 2^y), provided x and y are non-negative, and y is not too big. So it doesn't depend on endianness in this case. But if x is negative, then the value is undefined, so this is a problem. Another problem with the code in the article is that it assumes that CHAR_BIT is 8. --Zundark, 2001 Dec 11
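To make that concrete, a small illustration (my own example, not from the article) of how the shifts in the sample code pick out bytes by value rather than by memory layout:
unsigned x = 0x12345678u;
unsigned b0 = (x >> 0)  & 0xFF;  /* 0x78, least significant byte */
unsigned b1 = (x >> 8)  & 0xFF;  /* 0x56 */
unsigned b2 = (x >> 16) & 0xFF;  /* 0x34 */
unsigned b3 = (x >> 24) & 0xFF;  /* 0x12, most significant byte */
/* These results are identical on big-endian and little-endian machines,
   because x >> y is defined arithmetically as floor(x / 2^y) for
   non-negative x, independent of how the bytes are laid out in memory. */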
So maybe we should make them unsigned ints to take care of the sign issue? Is there a clean way to deal with the CHAR_BIT thingy?
I think we should probably have an implementation that deals with keys that are (arbitrary length) strings, since that seems to be the more interesting case. --AxelBoldt
I changed the int array to unsigned. The CHAR_BIT thing is more problematic, so I just added a note pointing it out. On most compilers for desktop computers CHAR_BIT is 8, so this isn't too bad a restriction. --Zundark, 2001 Dec 11
Having looked at it more closely, I see that it has more serious problems than those mentioned above, so I've moved it here until it gets fixed. --Zundark, 2001 Dec 11
Sample radix sort of an array of integers
This sample implementation is written in the C programming language.
/*
 * This implementation sorts one byte at a time, so 32 bit integers get sorted in 4 passes.
 * It uses a simplified bucket sort to sort on each byte, which requires O(n) memory.
 * It assumes that CHAR_BIT is 8.
 */
#include <string.h>                       /* for memset */

struct listnode                           /* a simple linked list data structure */
{                                         /* used for the bucket sort            */
    unsigned val;
    struct listnode * next;
};

void sort(unsigned * array, int length)
{
    int i, j, k;
    unsigned char key;
    struct listnode nodes[length];        /* an array of preallocated listnodes (C99 VLA) */
    struct listnode * buckets[0x100];     /* the buckets for the bucket sort              */

    memset(buckets, 0, 0x100 * sizeof(struct listnode *));   /* zero the bucket pointers  */

    for(i = 0; i < (int) sizeof(unsigned); i++)  /* iterate over the bytes in an unsigned int */
    {
        for(j = 0; j < length; j++)              /* iterate over the array */
        {
            key = (unsigned char)((array[j] >> (i * 8)) & 0xFF);  /* grab the byte        */
            nodes[j].val = array[j];                              /* put the value in the */
            nodes[j].next = buckets[key];                         /* bucket for that byte */
            buckets[key] = &(nodes[j]);
        }
        j = length - 1;
        for(k = 0xFF; k >= 0; k--)        /* loop over the buckets; k is a signed int, since */
        {                                 /* an unsigned char would make k >= 0 always true  */
            while(buckets[k] != 0)
            {
                array[j] = buckets[k]->val;       /* pull the values out of the buckets */
                buckets[k] = buckets[k]->next;    /* and put them back in the array     */
                j--;
            }
        }
    }
}
I have written FORTRAN examples of both the recursive forward radix sort and the nonrecursive reverse radix sort. Both examples sort pointers to fixed-length character string arrays. Should I add these examples to the article, or are non-C examples considered obsolete? StuRat 09:47, 5 August 2005 (UTC)
- Don't see why they would be! Not everyone uses C after all
Is there anyone who could write a Pascal version of radix sort?
Style Improvement Request
In the section "Incremental Trie-based Radix sort", there is a sentence with this as its subject: "The sorted lists of other non-incremental radix sorting algorithms that sort all of the keys in a batch". I hope it's obvious that this needs to be improved. I would do it myself, but I have no idea what it's trying to say. Can someone please attend to this? Thanks. Danielx 16:15, 13 August 2006 (UTC)
OK. I revised that sentence. -Ed Lee 24.12.11.31 03:04, 15 October 2006 (UTC)
Misleading or incorrect
This article makes very misleading claims about radix sort, primarily by neglecting to distinguish MSD and LSD radix sorts:
- The claim that the algorithm will "take the least significant digit (or group of bits, both being examples of radices) of each key".
This is true of LSD (least-significant digit) radix sort, not MSD.
- The claim that the algorithm runs in time O(n * k).
This is true for uniform data distribution in the LSD case, but it is always true for MSD. I would prefer to say that it runs in time O(B), where B is the size of the input in bytes/bits. It should also be noted that, while comparison sorts get away with O(n log n) comparisons, it's really O(B log n) time, which is more relevant to any comparison with radix sort.
- "In practice, O(n) time is only observed when the number of items is much greater than the size of the key space."
I believe this is a reference to the fact that LSD radix sort is O(n) (O(n * k) in the above notation) with sparse inputs, but not in the general case. This is selling radix sort short, since MSD radix sort does not have this shortcoming.
- "The greatest disadvantages of radix sort are that it usually cannot be made to run in place, so O(n) additional memory space is needed,"
Not true, at least of MSD. The article doesn't even try to qualify the "usually". Reasonable radix sort implementations (e.g. *BSD's radixsort(3)) are in place.
- "and that it requires one pass over the input list for each symbol in the key, so it is very slow for potentially-long keys"
Not true of MSD. MSD is O(1) per byte of input. Cf. comparison sorts, which are absolutely slow for long keys.
Despite my complaint, I do not feel competent enough to fix the article. Does anyone want to step up?
--71.235.102.239 23:21, 12 October 2006 (UTC)
- These all sound like valid criticisms. I believe the most appropriate way of repairing the article is a partial rewrite that acknowledges and separates discussion of the LSD and MSD variants in two sections. Deco 23:44, 13 October 2006 (UTC)
Most of the article was already separated into a discussion of least significant digit radix sorts versus most significant digit radix sorts. One of the earlier contributors was probably only familiar with least significant digit radix sorts, which is why the article originally started off sounding as if least significant digit radix sorts were the only kind that existed. I've more explicitly separated the discussion of least significant digit and most significant digit radix sorts, reformatted the queue example to be easier to read, and made some other changes.
I think that the statement, "In practice, O(n) time is only observed when the number of items is much greater than the size of the key space," refers to the observed execution time of an LSD radix sort processing fixed-length keys, relative to the execution time of an O(n*log(n)) time sorting algorithm. It might not be obvious from just looking at the asymptotic notation that a radix sort can actually be slower than a comparison-based sorting algorithm for a sufficiently small amount of input data, so I added some discussion of that.
I've communicated with people who insist on defining the length of a linear quantity of data as n*log(n), meaning n keys with log(n) bits per key. Under that convention, the time complexity of a radix sort ends up looking like O(n*log(n)), and the time complexity of some comparison-based sorting algorithms ends up looking like O(n*log(n)*log(n*log(n))). That notational convention is going to be confusing, though, so I did not want to go into a discussion of notational preferences.
Someone else who has relevant expertise in the area of in-place radix sorting will have to contribute on that topic, since I have not studied it and frankly can't see how it would work without allocating additional memory beyond the original memory required to store the input data. The computer would have to perform a lot of exchanges to do a radix sort in place, or perhaps I am misunderstanding what "in place" means in this context.
No one is omniscient, so I encourage you to contribute what truths you know to Wikipedia.
-Ed Lee 24.12.11.31 03:41, 15 October 2006 (UTC)
I've worked with a derivative algorithm, Spreadsort, for some time. Some fixes: Radixsort is O(n*k), or more precisely O(n*(k/s)*2^s) for an MSD Radixsort, where s is the size of the chunk processed at a time, in bits. The main sorting page for that is wrong, and I'm going to fix it. By contrast, comparisons can be quicker than O(k); they are often theta(1). They are only O(k) if done between items with very similar keys. Another issue with the assumption that a comparison is just as slow as checking small pieces of a key multiple times is that comparisons usually run sequentially along a single strip of memory (which is fast), versus looking at pieces of larger chunks in an array multiple times (which is slower). The biggest performance problem with MSD radixsort is the overhead of processing nearly empty bins (say you have 2 items in a bin, and they match in all but the last bit; you could be processing 256 bins each iteration, k/s times, instead of doing 1 comparison), and the memory overhead of creating bins. LSD's big problem is with long keys, especially strings. All the memory allocation/deallocation is really slow relative to other operations, on top of having to sort items of different length separately.
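To spell out where that bound comes from (my own bookkeeping, using the same n, k, s as above): the recursion depth of an s-bit-digit MSD radix sort is at most k/s, each level moves every one of the n keys once, and in the worst case each occupied bin at a level forces a scan of all 2^s child bins, so
\[
T_{\mathrm{MSD}}(n,k,s)=O\!\left(\sum_{\text{levels}}\Bigl(n + (\text{occupied bins at that level})\cdot 2^{s}\Bigr)\right)\le O\!\left(\frac{k}{s}\bigl(n + n\cdot 2^{s}\bigr)\right)=O\!\left(n\,\frac{k}{s}\,2^{s}\right).
\]
With s = 8 (one byte at a time) this is exactly the "256 bins, k/s times" overhead described above.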
There is nothing fundamental stopping an "in-place" MSD radixsort, as stability is not required; the memory overhead of such an implementation is proportional to 2^s. Spreadsort can be thought of as a modified MSD radixsort that switches to comparison-based sorting where radixsort is slow, once it has cut the problem down. Cuberoot31 08:32, 9 November 2006 (UTC)
I have thought about it a little more, and MSD radix sort doesn't have much going for it when k > log(n). When k < log(n) it can be in-place and thus use less memory than LSD, but MSD's best performance when n < 2^k requires data that splits evenly into a single item per bin. Otherwise, if you have 2 items per bin, and they only differ on the last bit, you will recurse through all 2^s bins multiple times until you hit the end of the radix, where a comparison-based algorithm would do a single comparison on those 2 and be done. It is important not to forget the overhead of creating, filling, and deleting bins (or whole copies of arrays) in analyzing the relative performance of radix sorts. I have found in practice that memory operations are much slower than all but the slowest comparisons (string < being one slow exception) on a system that has been up long enough to fragment its memory, so it is important to keep this in mind when designing and analyzing radix-based sorts. This is a major reason why radix-based sorts aren't more popular. If you want to use MSD radix sort for its ability to run in parallel, why not use MSD for the first (or first few) iterations to split up the problem into smaller bins, then share those bins out across multiple systems, and have them use LSD radix sort (or Spreadsort)? MSD radix sort has poor performance on small bins, which you will eventually get if you recurse long enough, so much so that I consider it an impractical algorithm for general application on its own. If I'm missing something, I'd welcome some more comment. I believe that Spreadsort solves this weakness in MSD radix sort, which is one of the reasons why I developed it. I would welcome some real performance data comparing MSD radix sort versus LSD, std::sort, or Spreadsort on a single processor where n << 2^k. If I don't get a reply soon (within a week), I will take it that this issue has been resolved, and I would like to take down the dispute tag. Cuberoot31 16:49, 10 November 2006 (UTC)
Cuberoot31's discussion of MSD radix sort seems limited to an array-based MSD radix sort. A trie-based radix sort is a kind of MSD radix sort, and it performs best in terms of lowest memory usage and fastest execution speed when the keys are redundant and sparsely distributed in the key space. The recursive function calls of a more traditional array-based MSD radix sort analogously trace the paths of a trie in stack space in a depth-first traversal. I wrote BRADSORT, which is a kind of trie-based radix sort, and a link to its source code is in the reference section of the Wikipedia radix sort article. You can compile BRADSORT with a C compiler and compare execution times with different quantities and distributions of input data.
Regarding Rspeer's change on 04:53, 2 December 2006, where Rspeer comments, "take out misinformation about big-O notation. You never assume constant memory," I disagree. In the analysis of the time complexity of sorting algorithms, you certainly do assume constant-bounded memory or else you lose all connection to physical reality, as memory accesses no longer happen in constant-bounded time as the size of the memory grows due to the speed of light limitation.
-Ed Lee 24.12.11.31 06:05, 7 December 2006 (UTC)
I agree that trie-based sorting does improve the worst-case memory usage and runtime for an MSD radix sort. So what do you claim as the worst, average, and best-case performance of your trie-based MSD radix sort? How does it handle, in an efficient manner, the problem of (many groups of) a few items with only slightly different key values of the form:
xxxxxxxxxxxxx
xxxxxxxxxxxxy
xxxxxyxxxxxxx
xxxxxxxxxyxxx
xxxxxxxxxxyxx
I don't see how any radix sort can be faster than theta(nk), and in terms of worst case, a trie-based sort can still have significant overhead due to storing possible next steps that may not be filled. As noted in the link, this can be six times the size of the input data; I'm not sure why it couldn't be higher, but I'd have to delve more into your implementation to see it. So radix-based sorts are good when the key value is short. For long keys, where k >> log(n), comparison-based sorting has the advantage of being able to look along a key linearly through memory (which is fast), as opposed to in a fragmented fashion, piece by piece, as radix sorts do, and comparison-based sorts can be done without significant additional memory, which can be very slow to allocate and deallocate relative to comparisons. The disadvantage of conventional comparison-based algorithms in this situation is that they still need log(n) comparisons, and if the keys don't differ until near the end, they need to look (linearly) through k bits each comparison. I'll do some runtime comparisons when I get a chance; I believe trie-based radix sorting can have excellent performance under some conditions. Cuberoot31 20:47, 7 December 2006 (UTC)
For a trie-based radix sort, the amount of time spent processing each unit length of input data has a constant upper bound, so the time complexity is O(N), where N is the total length of the input data, whether the strings are fixed in length or variable in length.
BRADSORT v1.50 deals with duplicate strings by storing one instance of a string and a count of the number of times that each string appears to indicate each string's frequency. So, if there are duplicates of a string in the input data which occupy more memory space than the amount of memory required to store a count and one instance of the string, then data compression occurs.
For unique strings, BRADSORT executes the most quickly and uses the least amount of memory space for representing strings as paths in the trie when the unique prefixes are short, such as:
axxxxxxx...
bxxxxxxxx...
cxxxxxxxx...
dxxxxxxxxx...
It does not matter how long the individual strings are if the unique prefixes are relatively short, because the trie only needs to represent the unique prefixes of the strings as paths in the trie in order to distinguish the strings from each other. The worst-case trie space usage occurs when the entire string or almost the entire string defines a unique string prefix, such as:
xxxxxxxxxa
xxxxxxxxxb
xxxxxxxxxc
xxxxxxxxxd
The worst-case space usage of BRADSORT and the longest sorting time occur in this situation, when strings are identical to each other except for the very last bit or last few bits. The memory space required to store the trie structure can be greater than 6 times the size of the input data. The exact amount of space in the worst case depends on the C compiler implementation and how much space it allocates for a node structure. For BRADSORT v1.50, each node structure below the root node corresponds to one bit of a string. For 2 strings that are nearly identical and overlap as paths in the trie except for the very last bit, the worst-case space required to store the trie structure would be approximately:
sizeof(node)*total_number_of_bits_in_strings/number_of_strings +
sizeof(node)*number_of_strings
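To make sizeof(node) concrete, a node in a bitwise (radix-2) trie of the kind described here could look roughly like the following. This is only an illustrative sketch; the field names and layout are my assumptions, not BRADSORT's actual declarations.
struct trie_node
{
    struct trie_node * child[2];   /* one child per bit value (0 or 1) at this depth    */
    unsigned char    * key;        /* at a leaf: pointer to one stored instance of the  */
                                   /* key; interior nodes would leave this null         */
    unsigned long      count;      /* at a leaf: number of duplicates of that key       */
};
A node of this shape occupies a dozen or more bytes, so in the worst case above, where roughly one node is needed per distinguishing bit, the trie can easily occupy several times the space of the strings themselves, consistent with the "greater than 6 times" figure.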
As I mentioned in the radix sort article, trie-based radix sorts can be used for incremental sorting. If you already have a sorted list and need to insert one new string into it, then you might have to re-sort the entire list with a batch sorting algorithm, which can require more time than inserting a new string into a trie.
Other programs use different trie structures and different methods for minimizing space usage. For example, the authors of burstsort avoid trie branching to some extent in order to increase the locality of data references for faster execution and to minimize space usage.
-Ed Lee 24.12.11.31 01:51, 8 December 2006 (UTC)
I agree that tries are useful for reusable data structures, and burstsort is another such algorithm I've been aware of for some time; I read up on it when looking at algorithms with some similarity to Spreadsort (I'm curious as to how burstsort and BRADSORT compare in performance). Many sorting algorithms have a trie equivalent, but for many problems an array is more appropriate; it depends on the application. In particular, rebuilding key values can add a large enough overhead to readout from the trie that lookups on a trie can be slower than even a binary search (depending on the trie, the data type, and whether people wish to look at the key itself or at a value stored against the key)! I'd like to clarify on sorting performance: your O(N) is equivalent to O(nk). Your s for the performance summaries I put on the sorting algorithm page sounds like it is 2, so your worst case is still basically O(nk). What is different is that since BRADSORT (and other conventional trie approaches such as burstsort) is allocating and deallocating memory, it has a significant number of memory operations, which are generally much slower than processing operations on a per-operation basis (sometimes requiring off-processor communication). Also, O(n*log(n)) comparison-based approaches really perform O(n*log(n)) comparisons and slightly fewer swaps, where swaps can be constant time if pointers are used, and the comparisons are O(k), but comparison operations reading linearly through memory are very fast on conventional modern computers. So it's really O(nk) slow operations for radix sorting versus O(n*log(n)) times (1 slow operation + 1 usually fast O(k) operation) for comparison-based sorting. I found this was true in developing Spreadsort, where comparisons were faster than memory allocation and swaps, which I needed to optimize aggressively. std::sort is a good basis for performance comparison because it is quite fast on modern processors. Cuberoot31 15:17, 9 December 2006 (UTC)
Earlier versions of BRADSORT did rebuild key values from paths in the trie, because the keys were entirely represented as paths in the trie, but BRADSORT v1.50 stores keys separately from the trie structure. Each individual key in BRADSORT v1.50 occupies contiguous memory space to represent the key in its entirety, so no rebuilding of a key is necessary. The trie structure in BRADSORT v1.50 acts more as a hierarchical index to the keys rather than as a complete representation of the keys. Reading out all of the keys with BRADSORT v1.50 is as fast as traversing a sorted linked list, because BRADSORT v1.50 uses a circularly threaded trie.
I don't think that we have a material difference of opinion on what notation to use for expressing the time complexity. If k has a constant upper bound in O(nk), then the expression can also be written as O(n) by the definition of O(). I prefer the more concise notation. We can theorize all that we want about the performance of sorting algorithms, but the bottom line is the performance that is achieved in the real world on real-world applications.
-Ed Lee 24.12.11.31 20:34, 9 December 2006 (UTC)
I do not accept the assumption that k has a constant upper bound; strings can be of infinite length and I've seen real-world sorting problems with keys over 1kB in size (compression is usually appropriate in such cases, but that's a side issue). I think it is appropriate to compare worst-case performance for large n AND large k, otherwise the relative performance of radix-based sorting and comparison-based sorting is difficult to understand. Let's try to keep the article accurate with only the assumptions necessary for a reasonable discussion, and in terms of real-world performance, radix-based algorithms do poorly relative to comparison-based algorithms for large k.Cuberoot31 04:26, 10 December 2006 (UTC)
There is no real computer system with an infinite amount of memory capable of storing a string of infinite length. Debating about which sorting algorithm is better in a situation where there are infinitely long keys is purely academic. If you define the length of the input data to be nk, the number of keys times the length of a key, then it is unsurprising that a radix sort will take O(nk) time to complete. An optimal comparison-based sorting algorithm will take O(nk*log(n)) time to complete with the same data.
I do not accept the generalization that radix-based algorithms perform, "poorly," relative to comparison-based algorithms for large key lengths, k. For incremental sorting, a trie-based radix sort can be faster than a batch sorting algorithm. For sorting many duplicate keys, a trie-based radix sort which represents multiple instances of a key as a count and one instance of the key can use less memory due to the resulting data compression and be faster than other sorting algorithms due to a greater locality of data references and less need or no need for virtual memory which is paged from a hard drive. The same conditions under which a comparison can be done in constant time benefit a trie-based radix sort, when the unique key prefixes are short relative to the lengths of the keys. The same conditions under which a comparison of two strings takes the longest time, when the two strings are identical or nearly identical up to the last bit, also make a trie-based radix sort take the longest time to process the data. Comparison-based sorting algorithms can be faster than radix sorting algorithms when a sufficiently small number of keys are processed. It is conceivable that a hybrid radix/comparison sort may be faster than a pure radix sort when the input data is partitioned into small enough groups by the radix sorting portion that can then be processed by a comparison-based sorting algorithm. -Ed Lee 24.12.11.31 03:39, 11 December 2006 (UTC)
It is correct that comparison-based algorithms will take O(nk*log(n)) worst-case time because of the time for comparisons, but since the comparisons run linearly through memory, a processor can run through even long k quickly, enough to make up for the log(n) for even large n (billions of items), while O(nk) separate operations require random accesses, which are very slow in comparison, and even worse, most radix-based algorithms require allocating and deleting memory, which is horribly slow (multiple orders of magnitude) relative to running linearly through k bytes in order. If modern processors weren't pipelined, this might be different, but pipelining is now a standard optimization in hardware design. Tries are basically equivalent to heaps in terms of being able to incrementally sort; this is a useful capability, I agree, but it does not differentiate radix-based sorting from comparison-based sorting. To attain compression, you need overhead bytes, and for some types of data there will be an expansion. As far as I know, this is a capability only of radix-based algorithms, and when applied it makes them specialized in their utility to problems that can be compressed; the overhead of compression is prohibitive otherwise. So this is a useful capability for a subset of problems, but not all problems. Spreadsort is an example of a hybrid algorithm, as is the more specialized burstsort. These approaches are significantly faster than quicksort alone, but when compared to std::sort on random data with large k, it becomes a much closer competition (in my tests Spreadsort was still faster than std::sort by 20-30%). I recommend comparing BRADSORT to std::sort in such a situation. The difficult part is creating a large-n, large-k test case. One conceivable way to do this is to write 100MB of random bytes to a file (0-255, use rand() % 256), and count every 0 as an end of string. Do not allow writing two zeros in a row, to prevent getting a significant number of null strings. Keep track of both runtime and memory usage. I will try to run this test myself when I next get sufficient time. Cuberoot31 20:08, 11 December 2006 (UTC)
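For what it's worth, a minimal sketch of the test-data generator just described (my reading of the proposal; the fixed seed and exact size are arbitrary choices, not part of it):
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const long total = 100L * 1024L * 1024L;  /* roughly 100MB of test data            */
    int prev = 1;                             /* previous byte; start non-zero         */
    long i;

    srand(12345);                             /* fixed seed so runs are repeatable     */
    for (i = 0; i < total; i++)
    {
        int c = rand() % 256;                 /* random byte 0-255                     */
        if (c == 0 && prev == 0)              /* never write two zeros in a row, so    */
            c = 1 + rand() % 255;             /* there are no empty strings            */
        putchar(c);                           /* 0 bytes act as end-of-string markers  */
        prev = c;
    }
    return 0;
}
Redirect stdout to a file and feed that file to each sorting program under test, measuring runtime and memory usage as suggested above.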
With the complicated and varied cache strategies employed by different microprocessors, it's hard to tell in advance how efficient or inefficient random memory accesses are going to be.
Regarding deletions, BRADSORT does not perform any deletions of nodes until the algorithm is finished, although incomplete deletion functions are in the source code. I never got around to updating the deletion functions for the new threaded trie structure in BRADSORT v1.50.
For your test of BRADSORT, you might have to increase the size of the input buffer to be large enough to hold the longest input string that you want:
/* The maximum length, in bytes, of an input string. You can modify this. */
#define MAXSLEN (255)
char s [MAXSLEN+1];
BRADSORT v1.50 defaults to using strings that are terminated by the '\n' character, ASCII value 10. I recommend that you put a '\n' character in s[MAXSLEN] in the main() function in case the input buffer contains no newline characters. Also, change the "MAXSLEN+1" expression in the while() to "MAXSLEN":
s[MAXSLEN] = '\n';
while (NULL != fgets (s, MAXSLEN, stdin))
if ((incomplete=insert10 ((uchar *)s))==1) break;
A more practical application of BRADSORT would be to sort words or count the number of times that each word occurs in some file, such as:
tr -cs [a-z][A-Z] [\012*] <textfile.txt | bradsort f >outputfile.txt
The tr program is available on Unix and Linux systems. This particular example does not deal with words that contain apostrophes or with hyphenated words. What it should do is replace every non-alphabetic character from a text file with the '\n' (newline, line feed) character and then squeeze consecutive instances of the '\n' character into one character, so that words are separated from each other on separate lines. The output then gets piped to BRADSORT with the "f" option, which displays the frequency of each word next to the word in the output file. The frequency is represented as a count of the number of times that each word appears in the input.
-Ed Lee 24.12.11.31 02:58, 12 December 2006 (UTC)
Constant memory
Ed, I really don't believe it's useful to assume constant memory when talking about the asymptotic behavior of an algorithm. That would make every algorithm run in O(1) space and O(1) time, if you think it through. If you disagree, point me to an algorithms textbook that assumes constant memory.
The only thing close to that that I can think of is something you may have touched on: sometimes you do assume each memory access happens in constant time, just to simplify things, even though it's not true. rspeer / ɹəədsɹ 16:17, 9 December 2006 (UTC)
O() notation was originally used in the study of pure math, where infinite domain intervals are commonly considered, but O() is still useful in the study of the relative performance of algorithms within a finite interval. No definition of O() that I have seen restricts its use to infinite intervals, just sufficiently large intervals:
Suppose f and g are two real-valued functions defined on real numbers. We write f=O(g) when there are constants c and N such that f(n)≤cg(n) for all n≥N .... It means that when n is large enough, some constant multiple of g(n) is an upper bound on the size of f(n).
(Christopher J. Van Wyk, Data Structures and C Programs, p. 35, ISBN 0-201-16116-8.)
There is no problem with using O() notation in O(1) space and O(1) time. You just don't abstract the notation down to that level where all time complexities and space complexities on real computers can be expressed as O(1). Allowing a computer's memory to be infinite in size, as I previously stated, loses touch with physical reality, and then O() is no longer as useful as an estimate of how much physical time an algorithm will require to complete.
-Ed Lee 24.12.11.31 19:50, 9 December 2006 (UTC)