Talk:Kolmogorov complexity


This article is within the scope of WikiProject Computer science.
This article has not yet received a rating on the assessment scale.
This article has not yet received an importance rating on the assessment scale.


Spelling

My bot semi-automatically fixed a mistake on this page (see [1]), making the replacement

parameterisation ->
parametrisation

I got an anonymous message today saying that the former is also a valid spelling. Is that true? I myself would think that it is not in this context. Oleg Alexandrov 6 July 2005 15:02 (UTC)

I think your bot spells correctly, although I am an American, and would use a 'z' instead of an 's'. The former spelling looks incorrect to me in any case. -- 130.94.162.61 23:59, 21 February 2006 (UTC)
See these pages [2] [3] in the Merriam-Webster Online Dictionary. :-) CiaPan 01:17, 22 February 2006 (UTC)

Opening definition

The first sentence reads as follows:

In computer science, algorithmic information theory is a field of study which attempts to capture the concept of complexity by using tools from theoretical computer science.

Typically the first sentence of an article is a definition or a thumbnail sketch of one. The above statement is neither. Additionally, it could equally well apply to the better-known field of Computational complexity theory.

I plan to fix this as time permits. Vonkje 22:42, 23 July 2005 (UTC)


Suggestion to move

I propose that this page be moved to Kolmogorov complexity. This is the main subject of the article and would simplify the article's introduction.--CSTAR 01:56, 3 October 2005 (UTC)


Compression

"incompressible strings must exist, since there are 2n bit strings of length n but only 2n − 1 shorter strings, that is strings of length n − 1."

I'm no expert, but I don't think this makes sense... there must be something like 2^{(n−1)!} shorter strings (more than 2^n for n > 3), and there is no reason to expect a string to be compressed by exactly one symbol. Can anyone restate this proof more clearly?

Take a set ("an alphabet") of A symbols. There are A^N different N-letter strings (sequences of length N with values from that set). For binary strings (a two-symbol set, A = 2) there are 2^N strings.
There are, of course, 2^{N−1} strings of length N−1, 2^{N−2} strings of length N−2, 2^{N−3} strings of length N−3, and so on, down to 2^1 = 2 strings of length 1. These terms add up to the sum known as a finite geometric series:
\sum_{n=1}^{N-1}2^n=2^1\cdot\frac{1-2^{N-1}}{1-2}=2\cdot(2^{N-1}-1)=2^N-2
which is less than the number of N-bit strings.
Actually it may be enough to compare just the numbers of different strings of length N and N−1, because the same problem arises on every level: if you want to compress N-bit strings into strings shorter than N−1, then you use some subset of the compression function's codomain that might already be occupied by compressed (N−1)-bit strings. But whichever way you calculate, you will get the same result: for every N > 0 there are more strings of length N (or shorter) to be compressed than strings of length N−1 (or shorter) that they might be compressed into.
CiaPan 07:55, 8 February 2006 (UTC) (Hope my explanation is clear enough - I'm not as fluent in English as in maths.) :-)


What about zero-length strings? Then the sum should run from 2^{N−1} down to 2^0:


\sum_{n=0}^{N-1}2^n=\frac{1-2^{N}}{1-2}=2^{N}-1
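A quick brute-force check of both sums in Python (my own addition, just to make the counting concrete; not part of the original discussion):

  # Count binary strings of each length directly and compare with the
  # closed forms 2^N - 2 (lengths 1..N-1) and 2^N - 1 (lengths 0..N-1).
  for N in range(1, 12):
      nonempty_shorter = sum(2**n for n in range(1, N))
      all_shorter = sum(2**n for n in range(0, N))
      assert nonempty_shorter == 2**N - 2
      assert all_shorter == 2**N - 1
      assert 2**N > all_shorter  # more N-bit strings than all shorter ones
  print("ok")

Either way, the number of strings of length N exceeds the number of all shorter strings, which is all the pigeonhole argument needs.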


The fact that there are more strings of exactly length N than all of the strings of length < N is not immediately obvious. Although the quoted sentence is correct, i.e. there are only 2^N − 1 strings shorter than N bits, the use of the word "shorter" makes this confusing.
I suggest
"incompressible strings must exist, since there are 2n bit strings of length n but only 2n − 1 shorter strings, that is strings of length n − 1 or less."
Yes No?
Spacepickle 16:45, 31 May 2006 (UTC)


Short answer: If you feel your version is more clear, insert it. Be bold.
Explanation: In the previous notes, when I used '(or shorter)' I meant: 'whether you count N-bit and (N−1)-bit strings, or all strings with length up to N and up to N−1 bits, you get the same result'. In other words I tried to merge two sentences into one: 'there are more binary strings of length N than of length N−1, and, by recursion, there are more strings of length N or less than of length N−1 or less'.
So it does not matter whether you consider only strings of a given length, or also include all shorter strings: there are always more strings to be compressed than possible results of compression.
More comments: There are three kinds of mathematicians: those who can count, and those who don't...
And among those who can, there are 10 kinds: those who understand binary, and those who don't.
For those who can count and understand binary, the proposition you expressed above (that there are more strings of length N than all strings shorter than N) is immediately obvious.
For length N, the number of strings of length exactly N and of any length less than N (each count shown in binary and in decimal):

   N | exactly N: binary | decimal | less than N: binary | decimal
  ---+-------------------+---------+---------------------+---------
   0 | 1                 | 1       | 0                   | 0
   1 | 10                | 2       | 1                   | 1
   2 | 100               | 4       | 11                  | 3
   3 | 1000              | 8       | 111                 | 7
   4 | 10000             | 16      | 1111                | 15
  ... | ...              | ...     | ...                 | ...
   N | 1 and N zeros     | 2^N     | N ones              | 2^N − 1
because there is exactly one (1) empty string and no (0) strings shorter than that:
NumEqual( 0 ) = 1
NumShorter( 0 ) = 0
which is displayed by the first row in the table, and for any natural N:
NumEqual( N+1 ) = 2 × NumEqual( N )
NumShorter( N+1 ) = NumShorter( N ) + NumEqual( N ).
And these recursions resolve to what the last row of the table shows. --CiaPan 17:00, 1 June 2006 (UTC)
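A direct transcription of those two recurrences into Python (my sketch, not CiaPan's code) confirms the closed forms in the last row of the table:

  def num_equal(n):    # strings of length exactly n
      return 1 if n == 0 else 2 * num_equal(n - 1)

  def num_shorter(n):  # strings of length strictly less than n
      return 0 if n == 0 else num_shorter(n - 1) + num_equal(n - 1)

  for n in range(10):
      assert num_equal(n) == 2**n
      assert num_shorter(n) == 2**n - 1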

Some issues

Why does Talk:Algorithmic information theory forward to Talk:Kolmogorov complexity?

Fixed. The following comment copied over to appropriate talk page.--CSTAR 17:31, 17 June 2006 (UTC)

The following remarks pertain to the article Algorithmic Information Theory. The article as it stands has some issues that I would like to bring to discussion. I hope that some consensus can be reached about how to deal with them.

  • There are no references.
  • Kolmogorov complexity is an active research area in mathematics as well as computer science.
  • The statement
Unlike classical information theory, algorithmic information theory gives formal,
rigorous definitions of a random string...

is a point of view that is not universally shared, although it has been championed by Chaitin in popularizations of the area. It is well known that the AIT definition of random string is actually only asymptotically well defined, so the viewpoint that AIT does not give a formal definition of a random string is equally valid (and, among mathematicians I have spoken with, a common viewpoint). The issue is the lack of a canonical choice of a universal machine.

  • The definition of a random infinite sequence has the same extent as the one given by Martin-Löf, and this should be mentioned.

CMummert 04:37, 16 June 2006 (UTC)
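For reference, the machine dependence mentioned in the third point above is usually packaged as the invariance theorem: for any two universal machines U and V there is a constant c_{U,V}, independent of x, such that

|K_U(x) - K_V(x)| \le c_{U,V}

so complexity is well defined only up to an additive constant, which is exactly why "random string" is only asymptotically well defined.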

Example strings

"The first string admits a short English language description namely "50 repetitions of '01'", which consists of 20 characters. The second one has no obvious simple description other than writing down the string itself, which has 100 characters." <----- This is not true. One could reference this string as eg. "the uncompressable 100 character string in http://en.wikipedia.org/wiki/Kolmogorov_complexity" which would have 95 characters.

What do you wiki-authors do with this now? I didn't dare to put it directly on the main page and wanted to spur some discussion. As far as I remember, Martin Gardner discusses this paradox, but I cannot find the source right now. (Markus Gaelli)

This is just a red herring. The paradox arises only because Kolmogorov complexity is only defined relative to a particular prefix-free universal machine. There are plenty of machines that make the repeated 01 string have higher Kolmogorov complexity than the other string, by carefully choosing indices. So the correct reading of the intro paragraph is to view it as an informal motivating example, which is fine for an intro paragraph. Once the formal definition is given, it becomes clear that the intro paragraph is speaking of some intuitive idea of description, which is made formal via universal machines. CMummert 23:57, 13 August 2006 (UTC)
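To make the informal point in the intro concrete, here is a toy Python comparison (my own sketch, not from the article):

  # The regular string has a description much shorter than itself.
  regular = "01" * 50          # the article's "50 repetitions of '01'"
  assert len(regular) == 100
  # The generating expression '"01" * 50' is 9 characters long, versus
  # the 100 characters it produces; for a string with no apparent
  # pattern, the shortest description we can exhibit by inspection is
  # the string itself.
  print(len('"01" * 50'), len(regular))   # prints: 9 100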

Mandelbrot set

The picture is very nice, but I don't see what this has to do with Kolmogorov complexity, since the object in question is not encoded by a finite sequence in any obvious way. Could you provide a reference?--CSTAR 02:02, 24 October 2006 (UTC)

If you look at the article for Mandelbrot set, it should be clear that this is encodable by a short program. The information theory textbook:
  • Thomas M. Cover, Joy A. Thomas. Elements of information theory, 1st Edition. New York: Wiley-Interscience, 1991. ISBN 0-471-06259-6.
2nd Edition. New York: Wiley-Interscience, 2006. ISBN 0-471-24195-4.
uses such a picture for its front cover and makes a similar point. Calbaer 18:43, 24 October 2006 (UTC)
I think CSTAR's point is that Kolmogorov complexity is defined for finite strings, not for subsets of the plane. Talking about the Kolmogorov complexity of the Mandelbrot set is, on its face, a category error. The Kolmogorov complexity of a program that generates the Mandelbrot set is another matter. Though even there, the claim that the complexity is "close to zero" is confusing. Is a thousand close to zero? I doubt its complexity is less than that, using a standard definition of universal Turing machine.
Since the Mandelbrot set is closed, you can code it naturally by a single real. It's probably true that the initial segments of that real have bounded Kolmogorov complexity. Is that what you meant? --Trovatore 20:21, 4 November 2006 (UTC)
The question of what it means for a real to code the Mandelbrot set doesn't have a unique answer. The Mandelbrot set can be coded via a computable enumeration of all the basic open balls in its complement (the fact that there is a computable sequence to do this is not obvious but has been proved). The Kolmogorov complexity of initial segments of a real can never be bounded, but K(x \upharpoonright n) will be bounded by K(n) + C when x is a computable real. CMummert 19:40, 8 November 2006 (UTC)
Well, if you think it's confusing, it need not be there. However, I adjusted the caption to hopefully make it more understandable and accurate. By the way, back in the 80s, I made some programs to create 640x480 bitmaps of fractals; on a PC, they compiled to approximately 2500 bits. Compared to most image-creating programs, that's "almost zero," but if there's a better way to phrase the caption, I've no objection to its being rephrased. Calbaer 03:59, 5 November 2006 (UTC)
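In the same spirit, a minimal Python sketch (my own, not Calbaer's actual 80s code): the source is a few hundred bytes, far smaller than the 640x480 bitmap it writes out. Slow in pure Python, but the point is the size of the source, not its speed.

  # Write a 640x480 Mandelbrot image as a plain-text PBM bitmap.
  W, H, IT = 640, 480, 100
  with open("mandel.pbm", "w") as f:
      f.write("P1\n%d %d\n" % (W, H))
      for j in range(H):
          row = []
          for i in range(W):
              c = complex(-2.5 + 3.5 * i / W, -1.25 + 2.5 * j / H)
              z, n = 0j, 0
              while abs(z) <= 2 and n < IT:   # escape-time iteration
                  z = z * z + c
                  n += 1
              row.append("1" if n == IT else "0")  # 1 = black = inside
          f.write(" ".join(row) + "\n")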
OK, this assertion is quite different; you're talking about bitmaps of fixed size n, of which there are 2^n. Note also that in your caption, f really isn't a function on C, but a function on some floating-point type, elements of which are also representable as finite bitstrings. To say "small" you would have to make an asymptotic assertion as n → ∞. I suppose this is true, although I don't have a reference or a proof.--CSTAR 17:20, 8 November 2006 (UTC)
The following question is open: whether for each n you can (in a uniform way) compute a bitmap with resolution 2^{-n} of black, white, and undefined pixels such that any pixel entirely contained in the Mandelbrot set is black and any pixel entirely outside it is white. This is hinted at in Mandelbrot set#Further results. There is a paper here [4]. CMummert 19:40, 8 November 2006 (UTC)
The current caption is nice, but is inaccurate, since all versions of the picture are JPEGs. Where's this 17 KB PNG? Unless we replace the JPEGs with PNGs, the caption should be reverted. Calbaer 20:13, 6 December 2006 (UTC)
You're right - it's a JPEG. I fixed the caption. CMummert 20:24, 6 December 2006 (UTC)
I still find it problematic. The underlying picture itself is 189 KB, not 17 KB. Also, to create it would require a JPEG encoder in addition to the fractal generator. While I'm sure it could be generated in under 17 KB, this isn't obvious, and, moreover, gzip can reduce its size (by about 1.8%), so saying the complexity is merely "much less" is saying very little. (And, yes, I realize my initial caption had the same problem.) What would be great is if someone could contrast a significant difference between the size of the code and the size of the (losslessly compressed) file. (If someone has a DOS emulator handy, I can send them my 80s-era Mandelbrot code to create a 640x480 image on screen.) But, as it is, things might be a bit confusing. Calbaer 21:12, 6 December 2006 (UTC)
I wouldn't mind it if the image were removed, and User:CSTAR has made similar comments. The new caption is an attempt to find an informally correct way to keep the image while relating it to the article material.
The fundamental question is whether it is possible to make a caption that is technically correct and relevant to the article. There are more serious technical issues with the current one than you mentioned, namely that Kolmogorov complexity is only defined up to a constant and so, relative to certain universal machines, the bitmap is a random string. The caption I wrote today was only intended to make the image fit into the context of the article - the previous caption gave little indication of why the image was there. Feel free to edit it. CMummert 21:21, 6 December 2006 (UTC)

Uncomputability of Kolmogorov complexity

It seems to me that the program GenerateParadoxicalString is completely unnecessary in the proof, as the desired contradiction is immediate from GenerateComplexString itself. All that's needed is to note the following contradictory consequences:

(1) GenerateComplexString, with input n, returns a string whose Kolmogorov complexity is at least n; consequently, for all n, the combined length of GenerateComplexString and n is at least n.

(2) The combined length of GenerateComplexString and n is just a constant plus the length of a numeral for n; consequently, for sufficiently large n this combined length is strictly less than n. --r.e.s. 05:37, 11 February 2007 (UTC)

The reason for the GenerateParadoxicalString function is to make it clear that there really is a "sufficiently large" value of n that works. Also, you need to be careful in (1) because adding two strings together can give you a string with much lower Kolmogorov complexity than the sum of the complexities of the strings. CMummert · talk 13:25, 11 February 2007 (UTC)
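For readers following this thread, here is a rough Python rendering of the two routines under discussion (my sketch; the article states them as pseudocode). KolmogorovComplexity is the hypothetical subroutine whose existence the proof assumes and then refutes, so nothing here actually runs to completion:

  from itertools import count, product

  def KolmogorovComplexity(s):
      # Assumed to exist for the sake of contradiction; no such
      # computable function exists, which is the theorem being proved.
      raise NotImplementedError

  def GenerateComplexString(n):
      # Enumerate all binary strings in length-lexicographic order and
      # return the first one whose complexity is at least n.
      for length in count(0):
          for bits in product("01", repeat=length):
              s = "".join(bits)
              if KolmogorovComplexity(s) >= n:
                  return s

  def GenerateParadoxicalString():
      # n0 is hard-coded; any n0 larger than this program's own length
      # (plus an interpreter constant) yields the contradiction.
      n0 = 8000  # illustrative value only, not from the article
      return GenerateComplexString(n0)

The point of the reply above is visible in the last routine: hard-coding a specific n0 is what pins down the "sufficiently large" n.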
Can't the point concerning "sufficiently large n" be made just as clearly using GenerateComplexString itself, rather than GenerateParadoxicalString? Also, doesn't the caution about Kolmogorov complexity when "adding" two strings apply with equal force to GenerateParadoxicalString? If so, the suggested change would yield a proof that's not only clearer and more direct, but also shorter by about half. --r.e.s. 14:24, 11 February 2007 (UTC)
Feel free to rewrite the proof; if there is something wrong with your revised version, either I or someone else will point it out. My comment above points out the two main areas where you need to take caution in your rewrite. If you want to be more cautious, you can put a new version here on the talk page for others to comment on. CMummert · talk 15:17, 11 February 2007 (UTC)
It doesn't make sense to rewrite it only to have the logic disputed afterwards, which is why I asked the two questions above (they were not merely rhetorical). I would prefer to discuss the basis before building on it. However, another fly in the ointment is the likelihood that someone will soon come along arguing that these "original" proofs don't belong in Wikipedia (and they'd be right, I guess, as I don't see any source cited for the present proof either), and all effort will have been wasted. Maybe the wiser course (for someone energetic) would be to locate an authoritative source and just cite it, paraphrasing the proof only if it's very short. --r.e.s. 16:36, 11 February 2007 (UTC)
Reply to r.e.s.: Your proof sketch seems OK to me, although note that all the proofs on this page are informal, and when written down using a formalization of string descriptions (using Turing machines), I'm not sure there is going to be any substantive difference in clarity or length. Also, for an expository and informal article, brevity of proofs is not necessarily always better (although it usually is). --CSTAR 17:21, 11 February 2007 (UTC)
The proof in this article (which is really the same as what you suggested above, just phrased differently) is not "original" in the WP sense; a proof of the result in question following the same lines can be found in Li and Vitanyi's book or Calude's book. There is broad consensus in mathematics articles that it is acceptable to include proofs of well-known results when those proofs are useful to explain what is going on. The proof we are discussing here is exactly that sort of proof. So there is no reason to remove the proof entirely.
The two bullets you gave above are not enough on their own to constitute a proof for an intended reader of this article, which is why I can't say what else is "needed" for an alternate proof. I just pointed out what the touchy points will be. CMummert · talk 18:45, 11 February 2007 (UTC)
I disagree that (1) and (2) are not enough, in that the proposed proof is at least as solid as the one in the article. Concerning your word of caution on the complexity of two "added" strings ... The two strings GenerateComplexString and its input n together constitute the program that's supposed to return a string with the property that all programs returning it have length at least n. (Taking the input as part of the program in this case is consistent with the way it's done more formally with TMs; e.g., as the article puts it, "if M is a TM which on input w outputs string x, then the concatenated string <M> w is a description of x".) Thus the inequality in (1) must hold by the very definition of Kolmogorov complexity. On the other hand, to question bullet (2) is to question an argument that's already used in the article, though applied there to the obfuscating program GenerateParadoxicalString.
However, I'll leave the suggested improvement to others if they wish to pursue it, as I doubt whether I have the energy to dispute the matter much further. --r.e.s. 02:04, 12 February 2007 (UTC)
The difficulty with the argument you just presented is: what is the value of n that actually leads to a contradiction? In order to turn an input into a program, you have to have a particular value of the input to encode (in binary, for example). The purpose of GenerateParadoxicalString is to obtain a bound that lets you choose a specific value of n. CMummert · talk 02:55, 12 February 2007 (UTC)
There is no difficulty, since all that's needed is that the inequality in (1) holds for all n, and the inequality in (2) holds for some n. Throughout, it's understood that some encoding has been chosen for all inputs n, and that it's the length of the encoding that's meant by "the length of n". At the same level of detail and/or informality as in the current proof, one can establish (1) and (2), and hence the desired contradiction -- with no need of GenerateParadoxicalString.--r.e.s. 12:29, 12 February 2007 (UTC)
{{sofixit}}. That's what I said above, as well. CMummert · talk 12:55, 12 February 2007 (UTC)
Thanks for the advice to feel free about doing a rewrite. At this point my choice is to decline.--r.e.s. 13:59, 12 February 2007 (UTC)