Talk:Miller-Rabin primality test


This article is within the scope of WikiProject Computer science, which aims to create a comprehensive computer science reference for Wikipedia. Visit the project page for more information and to join in on related discussions.
This article has not yet received a rating on the assessment scale.
This article has not yet received an importance rating on the assessment scale.

To 203.217.64.49:

  • Please direct questions and other discussion on the talk page, not in the article text.
  • Any nonzero integer is uniquely expressible as an odd multiple of a power of 2. This defines k and m from n. In other words, k is the largest integer such that 2^k divides n − 1.

-- EJ 11:49, 21 Sep 2004 (UTC)

Don't you get some really large numbers when doing this? I'm trying to write a C++ implementation using 1009, which is prime, as a test number, and I just realized that in larger bases you get very big numbers in the second test. For example, 1009 - 1 = 2^4 \times 63, so for the second test you might have 2^{2^3 \cdot 63} = 2^{504}

I haven't stepped through it, but as with all crypto stuff involving modular exponentiation, you can keep reducing it at every step. See modular exponentiation. CryptoDerk July 8, 2005 21:25 (UTC)
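
For reference, a minimal sketch (in Python, purely for illustration) of the square-and-multiply idea CryptoDerk describes: the result is reduced modulo n after every multiplication, so no intermediate value ever grows much beyond n^2, even though the exponent itself may be huge. The function name mod_pow is made up here; Python's built-in pow(a, e, n) already does the same thing.

    def mod_pow(base, exponent, modulus):
        # Square-and-multiply, reducing mod the modulus at every step.
        result = 1
        base %= modulus
        while exponent > 0:
            if exponent & 1:                      # low bit set: fold base into result
                result = (result * base) % modulus
            base = (base * base) % modulus        # square and reduce
            exponent >>= 1
        return result

    # The case from the question above: a = 2, n = 1009, exponent 504 = 2^3 * 63.
    print(mod_pow(2, 504, 1009))                  # agrees with pow(2, 504, 1009)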


[edit] Ruby Code

Wouldn't it be much (well, at least a little, compared to the overall complexity) more efficient to store the value of s instead of shifting until we reach n-1 in the example Ruby code? I do not know Ruby, but it is safe to say that shifting the same values repeatedly is less efficient than storing the value of s in the calculation. Coolv(TalkContribs) 19:11, 15 June 2007 (UTC)

I'd expect that your proposal would have a very small impact on the runtime, because shift operations are cheap compared to modular multiplication. But your proposal would possibly make the code easier to understand. E.g., it seems that even the author of the code was a little confused. The while loop computes all powers a^{2^r d} for all r\leq s while the Miller-Rabin test only calls for the cases with r\leq s-1. I'd further propose the following modifications: (1) the function should not be called prime, since it is only a pseudoprimality test; (2) negative integers should not be declared prime. 85.2.99.96 16:16, 4 July 2007 (UTC)
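
To make the suggestion above concrete, here is a hedged sketch of a single Miller-Rabin round structured the way 85.2.99.96 proposes (written in Python rather than Ruby, since the removed Ruby snippet is not reproduced on this page): a^d is computed once and then squared only s − 1 times, which covers exactly the exponents a^{2^r d} with 0 ≤ r ≤ s − 1.

    def mr_round(n, a, s, d):
        # One round of Miller-Rabin for odd n > 2, with n - 1 = 2^s * d and d odd.
        # Returns True if a does NOT witness compositeness ("probably prime").
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            return True
        for _ in range(s - 1):        # r = 1, ..., s - 1 only
            x = (x * x) % n
            if x == n - 1:
                return True
        return False                  # a is a witness: n is composite

As points (1) and (2) above suggest, a caller would still have to reject even and negative n before decomposing n − 1.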

[edit] Smallest number for which MR test is uncertain?

Hello. It seems one could establish by brute force that the MR test gives exactly the right answer for all numbers up to some limit (determined by how long you run the MR test on successive integers). Has anyone done a brute force search like that? Or maybe someone has proved that MR test is correct (i.e., reports N is prime iff N is actually prime) for all numbers up to some bound? -- I'm assuming here that the test is computationally deterministic, i.e., if I run the test twice on the same input, I'll always get the same report. Maybe my understanding of the test is flawed. It seems like the "probabilistic" aspect of the test could be explained at greater length. Many thanks to the authors for their work on this article. 207.174.201.18 18:47, 1 February 2006 (UTC)

You apparently misunderstood something. Probabilistic tests are explained at greater length in Primality test and Randomized algorithm, which are both prominently linked from the first paragraph of the article.
It would help this article to briefly state what is the import of the fact that Rabin's version is probabilistic. Given the subtlety of the concept, it's not enough to give a link and assume the readers can go figure it out for themselves. 207.174.201.18 00:43, 2 February 2006 (UTC)
Well, people certainly can go follow the link, and at some point it becomes necessary to assume they do, because it is impossible to repeat everything everywhere. But I'll try to put some summary here, you're right that the concept is confusing. -- EJ 20:16, 3 February 2006 (UTC)
The probabilistic (Rabin's) version of the algorithm, which is discussed in the article, is provably correct for all composites. It is not deterministic, which is (1) the whole point of the definition of a probabilistic test, and (2) obvious from the description of the algorithm: "pick a randomly in the range [1, n − 1]". I have removed the irrelevant, and wrong as stated, sentence from the "Additional information" section, which might have confused you.
What can be said about Rabin's test if it is implemented on a computer which has deterministic hardware? (I.e., assuming random numbers come from a pseudorandom number generator and not from atomic decay or some other "really random" source.) 207.174.201.18 00:43, 2 February 2006 (UTC)
That's a very good question, but I'm afraid that nobody knows a satisfactory answer. Derandomization and pseudorandom generators are a big topic in theoretical computer science. However, these constructions of PRGs tend to be inefficient (and thus rarely used in practice), and they depend on unproven assumptions about computational hardness of certain functions. PRGs used in practice are based on heuristics without any serious analysis. -- EJ 20:16, 3 February 2006 (UTC)
The deterministic (Miller's) version of the algorithm, which is only briefly mentioned in the introduction, is only conjectured to be correct for all composites. As for brute force verification, it is known e.g. that all numbers n < 341,550,071,728,321 which pass the test for bases a = 2, 3, 5, 7, 11, 13, and 17, are prime [1]. As 17 is well below 2(ln n)^2, Miller's test is also correct for these n. The Riemann hypothesis was also verified by brute force for a lot of numbers, but I don't know whether anybody tried similarly to verify the Generalized Riemann Hypothesis for quadratic Dirichlet characters, which is what the Miller test depends on. -- EJ 20:22, 1 February 2006 (UTC)
OK, thanks a lot for these facts. If I understand correctly, the outcome of the MR test is uncertain only if n >= 341,550,071,728,321 (because for numbers less than that, the test reports a number may be prime iff it is indeed prime). I think this aspect should be mentioned in the article itself. 207.174.201.18 00:43, 2 February 2006 (UTC)
Bear in mind that these results only concern the deterministic version of the algorithm, they are irrelevant for the randomized version. Also, do not take the bound too seriously, the fact that I could quickly find it does not mean that no better bounds are available.
The current (almost nonexistent) coverage of the deterministic algorithm in the article should be expanded, thanks for the tip. -- EJ 20:16, 3 February 2006 (UTC)
The ANTS'94 paper by Alford, Granville and Pomerance contains some conjectures about the number of tests needed for a deterministic version. I.e., they expect that (1 + o(1))(log(n)loglog(n)) tests are sufficient, whereas o(log(n)) is not sufficient. 24.228.93.22 05:07, 5 February 2006 (UTC)
Thanks for the pointer. -- EJ 17:30, 11 February 2006 (UTC)
Bleichenbacher [2] showed that Miller's test holds for n < 1×10^16. That's the best result I've seen; I'd love to know of a better result. Sloane's OEIS doesn't have much information about SPSPs above 1×10^13 and nothing above 1×10^16 as I recall. CRGreathouse (talkcontribs) 02:38, 6 August 2006 (UTC)
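
To illustrate the deterministic check discussed in this thread, here is a hedged Python sketch relying only on the fact EJ quotes above, namely that every composite n < 341,550,071,728,321 fails the strong test for at least one of the bases 2, 3, 5, 7, 11, 13, 17. The helper name is_prime_small is invented for this sketch.

    def is_prime_small(n):
        # Deterministic for n < 341,550,071,728,321, per the bound cited above.
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13, 17):
            if n % p == 0:
                return n == p
        s, d = 0, n - 1                   # write n - 1 = 2^s * d with d odd
        while d % 2 == 0:
            s, d = s + 1, d // 2
        for a in (2, 3, 5, 7, 11, 13, 17):
            x = pow(a, d, n)
            if x == 1 or x == n - 1:
                continue
            for _ in range(s - 1):
                x = (x * x) % n
                if x == n - 1:
                    break
            else:
                return False              # a witnesses that n is composite
        return True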

[edit] Some imprecise (wrong) formulations ?

After reading the article I have the suspicion that some of the formulations are very imprecise (or plain wrong :-)):

"Strong Witness": If I understand this correctly, it means that we found a number x for which the following equations hold:


x \equiv a^{2^r\cdot d} \pmod{n} for some 0 \le r \le s-1

x \not\equiv -1\pmod{n}

x \not\equiv  1\pmod{n}

x^2 \equiv 1 \pmod{n}

To sum up: x is a non-trivial square root of 1 (mod n)

Are there any non-trivial square roots of 1, if n is a prime? If there are not, then I assume that "Strong Witness" would be better called "Sure Witness" for the compositeness of n. Further, at the beginning, where it says "Let n be an odd prime, ... Then, one of the following must be true for some a", it should be "... Then, one of the following must be true for EVERY a ..."

Is the above correct ?

213.70.137.66 17:49, 14 February 2006 (UTC)

You're right that there aren't non-trivial square roots of 1 mod prime p. If x^2 = 1 (mod p), then (x-1)(x+1) = 0 (mod p). If x is neither 1 nor -1 mod p, then neither x-1 nor x+1 is zero mod p, which means we have a nontrivial factorization of p, which is of course impossible. This is how we can claim that taking square roots repeatedly will either always yield 1 or eventually yield -1. In fact, as the pseudocode shows, any failure of the conditions proves compositeness. I've modified the article to read witness of compositeness rather than strong witness. You're also correct that it should say "for all a"; it is for composite numbers that it is only true for some a. Deco 18:24, 14 February 2006 (UTC)
I've added the lemma about non-trivial square roots of 1 to the page. It's fairly obvious, but it doesn't hurt to spell out everything. Deco 18:53, 14 February 2006 (UTC)
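
As a quick brute-force illustration of the lemma (a sanity check only, not anything from the article): modulo a prime the only square roots of 1 are ±1, while an odd composite modulus can have additional, non-trivial ones, which is exactly what a witness exhibits.

    def square_roots_of_one(n):
        # Brute force: all x in [1, n-1] with x^2 = 1 (mod n).
        return [x for x in range(1, n) if (x * x) % n == 1]

    print(square_roots_of_one(17))    # [1, 16] -- only +-1 modulo a prime
    print(square_roots_of_one(21))    # [1, 8, 13, 20] -- extra roots modulo 3*7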

[edit] Deterministic variants of the test

82.6.98.77, the result has been published in a respectable peer-reviewed journal, and as it is referenced by several other sources, it seems to be accepted by the community. If you disagree with the result, you cannot just slap a "this is wrong" notice, you have to bring forth convincing evidence. -- EJ 20:18, 27 April 2006 (UTC)

Counter example: Take n=7. So n-1 = 6 and s=1, d=3 since 2^1 \cdot 3 = 6. The article claims it's enough to check a = 2, 7 and 61. Try a=7. The test described in the "deterministic variants" section then says n is composite. -- XYZ, June 8 2006

Could it be that only a in the range 1...(n-1) should be tested? If so, it's not clear from the page. -- XYZ, June 8 2006.

Yes, only a in the interval [1,n-1] should be checked. -- EJ 19:17, 8 June 2006 (UTC)

Yeah, sorry for my stupid comment. EJ is correct, a must always be smaller than n. In fact, we specify that a\in \left(\mathbb{Z}/n\mathbb{Z}\right)^*, which excludes the possibility that a \equiv 0 \pmod{n}. I assume the result of Jaeschke takes this into account - not to say this doesn't deserve clarification. Deco 19:28, 8 June 2006 (UTC)
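
A tiny numerical illustration of the pitfall in this thread (nothing beyond what was said above): for n = 7 the base a = 7 is congruent to 0 mod n, so all of its powers vanish and the strong test wrongly "detects" compositeness, whereas any base reduced into [1, n − 1] behaves as expected.

    n, s, d = 7, 1, 3              # n - 1 = 6 = 2^1 * 3
    print(pow(7, d, n))            # 0 -- neither 1 nor n-1, so a = 7 is not a valid base
    print(pow(2, d, n))            # 1 -- base 2 passes at r = 0
    print(pow(61 % n, d, n))       # 6 == n-1 -- base 61, reduced to 5, also passes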

[edit] Strength

I added the word in bold: "The Miller-Rabin test is strictly stronger than the Solovay-Strassen primality test in the sense that the set of strong liars of the Miller-Rabin test is a proper subset of the set of liars of the Solovay-Strassen primality test."

It was removed, which is fine with me. (It was actually changed to "(often proper)", which seems overly pedantic to me, but that's beside the point.) Thinking of the set of all {M-R, S-S} liars grouped by base, the inclusion is proper; thinking of a particular base, the result isn't proper (but is still an inclusion).

However, if we're not going to call the subset proper, we shouldn't use the term "strictly stronger"; it should be stronger. I propose one of the two wordings below (differences bolded):

"The Miller-Rabin test is strictly stronger than the Solovay-Strassen primality test in the sense the set of strong liars of the Miller-Rabin test is a proper subset of the set of the Solovay-Strassen primality test."
"The Miller-Rabin test is stronger than the Solovay-Strassen primality test in the sense the set of strong liars of the Miller-Rabin test is a subset of the set of the Solovay-Strassen primality test."

If there's no feedback, I'm going to make one of these changes in a few days. CRGreathouse (talkcontribs) 17:01, 5 August 2006 (UTC)

The M-R test is an algorithm, which gets an integer n and a randomly chosen integer a, and tries to use the latter to determine the compositeness of the former. It is strictly stronger than S-S because it is correct whenever S-S is correct, and moreover, there are n and a for which S-S is incorrect, but M-R is correct.
The "set of strong liars", on the other hand, has no absolute meaning. It only makes sense if we fix n, albeit implicitly (as in the offending sentence, in the current version and all versions you proposed). For some n it is a proper subset of Euler liars, for some n it equals the set of Euler liars. We cannot call it proper without further qualification, because that would read as "for every n, the set of strong liars is a proper subset of Euler liars", which is false.
IMO both the original wording and the current wording are OK, in the sense that they are not incorrect, and anyone who understands the concepts will figure out what it exactly means. I disagree with your suggestions, because the first one is incorrect, and the second one fails to mention that M-R is strictly stronger than S-S, which is the whole point of the sentence. If you insist on changing the formulation, the proper way would be
"The Miller-Rabin test is strictly stronger than the Solovay-Strassen primality test in the sense that for every composite n, the set of strong liars for n is a subset of the set of Euler liars for n, and for many n, the subset is proper."
but I think that this is too verbose. -- EJ 18:48, 5 August 2006 (UTC)
I think that's fine actually and not too verbose. Deco 01:43, 6 August 2006 (UTC)
OK. CRGreathouse apparently does not object, so I'll put that in. -- EJ 19:42, 13 August 2006 (UTC)
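
For anyone who wants to see the inclusion concretely, here is a hedged brute-force comparison for a single small composite (the helper names are made up for this sketch; the Jacobi symbol routine is the textbook reciprocity algorithm).

    from math import gcd

    def jacobi(a, n):
        # Jacobi symbol (a/n) for odd n > 0, via quadratic reciprocity.
        a %= n
        result = 1
        while a:
            while a % 2 == 0:
                a //= 2
                if n % 8 in (3, 5):
                    result = -result
            a, n = n, a
            if a % 4 == 3 and n % 4 == 3:
                result = -result
            a %= n
        return result if n == 1 else 0

    def is_strong_liar(a, n, s, d):
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            return True
        for _ in range(s - 1):
            x = (x * x) % n
            if x == n - 1:
                return True
        return False

    def strong_liars(n):
        s, d = 0, n - 1
        while d % 2 == 0:
            s, d = s + 1, d // 2
        return {a for a in range(1, n) if gcd(a, n) == 1 and is_strong_liar(a, n, s, d)}

    def euler_liars(n):
        # Euler liars: a coprime to n with a^((n-1)/2) = (a/n) (mod n).
        return {a for a in range(1, n) if gcd(a, n) == 1
                and pow(a, (n - 1) // 2, n) == jacobi(a, n) % n}

    n = 65                                        # a small odd composite
    assert strong_liars(n) <= euler_liars(n)      # the inclusion from the sentence above
    print(sorted(strong_liars(n)), sorted(euler_liars(n)))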

[edit] Broken Algorithm?

We know 5 is a prime. So s=2, d=1 and n=5. Let's suppose a=4 (it's in the range 1-4) and r=1, since the case r=0 supports the primality. Then we have (1) 4^1 mod 5 = 4, so it is != 1

(2) 4^(2^1 * 1) mod 5 = 1, so it is != -1 (that is, != 4). Therefore, 5 isn't a prime? —The preceding unsigned comment was added by 84.159.42.147 (talk • contribs) 23:34, 14 July 2007 (UTC)

For all r means for both r = 0 and r = 1. As you say, taking r = 0 already "supports the primality", hence whatever happens with r = 1 is irrelevant. -- EJ 08:40, 15 July 2007 (UTC)
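
Spelling out this reply with the actual numbers (nothing here beyond the thread above): n − 1 = 4 = 2^2 · 1, so s = 2 and d = 1, and the base a = 4 already satisfies the r = 0 condition.

    n, s, d, a = 5, 2, 1, 4
    print(pow(a, d, n))         # 4 == n - 1: the r = 0 case succeeds, 5 is a probable prime
    print(pow(a, 2 * d, n))     # 1: whatever happens at r = 1 is irrelevant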

[edit] Generating coprime a

In an edit summary, User:EJ said: "The algorithm takes a in full Z/nZ (save the obvious exception a=0), because there is no simple way of generating a's coprime to n."

He's correct that the algorithm as it's normally formulated does not restrict a to a value coprime to n, but I think there is a simple way of generating such a: generate a random a, test GCD with n, and if it exceeds 1 return COMPOSITE, else continue. This isn't needed for correctness though, and provided the value n has (as is typical) been prescreened for small divisors, the chance of hitting a value sharing factors with n is so small as to make this not a useful optimization. Dcoetzee 19:57, 5 November 2007 (UTC)

You're right that my edit summary was misleading. The actual reason why the algorithm does not restrict a to values coprime to n is that it is a pointless complication: the other a are all compositeness witnesses anyway (they do not need to be explicitly tested for, as the condition a^{n-1} \equiv 1 \pmod{n} implies that a is coprime to n), and in fact, provide a factorization of n. And, as you say, they are too rare to make worrying about them worthwhile (which is, after all, one of the reasons why factoring is conjectured to be hard). -- EJ 10:44, 6 November 2007 (UTC)
To make the last sentence less confusing: what I was trying to point out was that taking gcd with random numbers is a rather inefficient way of finding factors, it is better to try small divisors in order (if there is a realistic chance that n might have small factors at all). Anyway, forget about factoring, it's irrelevant to Rabin-Miller. -- EJ 11:32, 7 November 2007 (UTC)
Yup. Dcoetzee 11:36, 7 November 2007 (UTC)
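
For concreteness, a hedged sketch of the prescreen Dcoetzee describes (the function name is invented for this sketch, and, as EJ notes above, the gcd step is not needed for correctness, since such bases already fail the a^{n-1} \equiv 1 condition):

    import random
    from math import gcd

    def mr_round_with_gcd(n, s, d):
        # One randomized round for odd n > 3, with n - 1 = 2^s * d and d odd.
        a = random.randrange(2, n - 1)       # random base in [2, n-2]
        if gcd(a, n) > 1:
            return False                     # a shares a factor with n: composite
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            return True                      # "probably prime" for this round
        for _ in range(s - 1):
            x = (x * x) % n
            if x == n - 1:
                return True
        return False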

[edit] Haskell code

Why was the Haskell implementation of the algorithm removed? It's not true that the given pseudocode is sufficient to directly write efficient code in a programming language. And the removed Haskell code was not a straightforward translation of the pseudocode either. So, I think, we should either restore the Haskell sample or write other pseudocode, better suited for an implementation. —Preceding unsigned comment added by 87.255.1.40 (talk) 17:06, 10 April 2008 (UTC)

Wikipedia is not a source code repository. The article is supposed to describe the algorithm so that the reader can understand why it works; it is not supposed to give optimized implementations ready for copy&paste. Furthermore, what is or is not efficient heavily depends on the language used, as well as other circumstances (typical size and structure of input data, hardware parameters, etc.) Thus anybody wishing to write an efficient implementation would still have to do it on his/her own, not by copying a random Haskell implementation, especially since he/she is likely to use a completely different language. Finally, the Wikipedia verifiability policy applies to this situation as usual: unless you can supply a reference to a published, reliable source, your Haskell implementation counts as original research, which is not permitted on WP. You know, why should we take your word on the correctness or efficiency of your code?
If you are just looking for a public place to post your code, I'm sure you'll be able to find plenty of internet sites suitable for that purpose. Wikipedia is not one of them. — EJ (talk) 14:48, 14 April 2008 (UTC)