Talk:Quadratic sieve

From Wikipedia, the free encyclopedia

The paragraph named "Data processing" needs a serious rewrite. I just fixed an error that was introduced six(!) months ago, but even though it does not contain any more "logical errors", it does not show what happens at all. The 350^2 = (-90)^2 example is a very, very bad one, because actually 350 *is* -90 (mod 55). It only happens to prove the result because 350-90 = 2*350 has factor 5 common with 55, which itself is true only because 350 contains factor 5. This is nonsense! I really believe a less contrived example would be welcome, as it would decrease the probability of such errors and would help readers get a better grasp of the concept.

OK, I wrote a new example, with a bigger n and more strictly following the procedure outlined in the article (this example uses sieving). Decrypt3 23:44, Dec 13, 2004 (UTC)

Contents

[edit] New stuff

I tried to add a more approachable introduction to the ideas behind the algorithm, based roughly on the presentation from Prime Numbers: A Computational Perspective. Please scan my contributions for errors - number theory isn't quite my forte. Thanks. Deco 22:05, 8 October 2005 (UTC)


[edit] Corrections

I'm not familiar with how to edit these pages, but with the example of 1817, you should also produce the term (5,392) with 392=2^3*7^2 being smooth.

Also the line ...

  x^2\equiv 1817\pmod{p} 

should read (x+42)^2\equiv 1817\pmod{p}

[edit] Space complexity

I'm not yet sure whether the recent correction to the space complexity is correct or not. The optimal choice of the smoothness bound B is indeed the square root of the time complexity. On the other hand, if the bit matrix is represented explicitly, it requires B2 space. On the other other hand, we don't normally represent it explicitly - we use a sparse matrix representation that only stores a small number of factors per row, and reduce it using the Lanczos method for large sparse linear systems. We can't guarantee that the matrix will be sparse, but in practice it is. The cited reference referred to Number Field Sieve and so does not apply. Until someone finds an explicit reference describing the space complexity, I've removed any mention of it, since I'm uncertain myself. Deco 18:53, 26 November 2006 (UTC)

  • I searched high and low, but I couldn't find anything on the space complexity, either. Perhaps it's time to ask someone with more knowledge in number theory. On another note, I'd imagine that each main step of the quadratic sieve would have different complexities, with the sieving and matrix-solving steps being particularly large when compared to the other three. Personally, I think that there are still many things that can be added to the article. --Ixfd64 11:55, 27 November 2006 (UTC)
    • My computational number theory book (already referenced in the article) has the following to say in section 6.1.3:
      • There have been suggested at least three alternative sparse-matrix methods intended to replace Gaussian elimination [...] the space required is O(N ln B). Since our factorization matrices have at most O(ln n) nonzero entries per row, the space requirement for the matrix stage of the algorithm, using a sparse encoding, is O(B ln2 n).
    • If you put this together with the optimal choice of B being O\left(\exp\frac{1}{2}\left(\sqrt{\log n \log\log n}\right)\right), this means the space in the matrix stage is:
      • O\left(\log^2 n \exp\frac{1}{2}\left(\sqrt{\log n \log\log n}\right)\right)
    • On the other hand, my book also says that "there are hidden costs for increasing the list of primes p with which we sieve. One is that it is unlikely we can fit the entire sieve array into memory on a computer, so we segment it [...]". It seems difficult to establish which stage dominates in space, although in practice the space issues of the sieve are more easily overcome. As always, little is rigorous when it comes to QS. I suggest that we say that the matrix stage requires the amount of memory shown above, and leave it at that. Deco 12:21, 27 November 2006 (UTC)

[edit] To do

Some stuff that needs to be done in this article:

  • Explain how the asymptotically optimal value of the smoothness bound B is chosen.
  • Explain better the "sign" bit of the vectors and why it's needed.
  • Talk about multiple polynomial variations.
  • Talk about sparse matrix representations and solution methods and their applicability.
  • Talk about how large a region needs to be sieved.
  • Talk about the use of approximate logarithms to speed up calculations.
  • Similarities and differences with number field sieve.
  • Lots more stuff.

Just some things to keep in mind for future expansion. There is plenty to add here. Deco 01:13, 29 November 2006 (UTC)