PNC2 FAQ

I don't get the square root trick in relation to the sieve

It works in much the same way here: it lets us avoid unnecessary (or, more specifically to the sieve, redundant) work.

Take 16 for example.

To find all the primes up to 16 (square root is 4), we'd do the following:

- Mark every multiple of 2 beyond 2 itself: 4, 6, 8, 10, 12, 14, 16.
- Mark every multiple of 3 beyond 3 itself: 6, 9, 12, 15.
- 4 is already marked (it is a multiple of 2), and 4 is our square root point, so we stop there.

So now, when we read through the array to find the unmarked values, we find: 2, 3, 5, 7, 11, and 13 (every prime up to 16), even though we only ever processed multiples of 2 and 3.
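
Here is a minimal sketch of that walkthrough in C (not the project's actual code; the names and layout are just for illustration):

```c
#include <stdio.h>

#define LIMIT 16   /* the range we care about */
#define ROOT   4   /* square root of 16: the last base value we need to process */

int main(void)
{
    char composite[LIMIT + 1] = {0};       /* 0 = unmarked, assumed prime */

    /* Cross off multiples of every base value up to the square root only. */
    for (int base = 2; base <= ROOT; base++) {
        if (composite[base])
            continue;                      /* 4 is already marked by 2, skip it */
        for (int m = base * 2; m <= LIMIT; m += base)
            composite[m] = 1;              /* 2: 4,6,...,16   3: 6,9,12,15 */
    }

    /* Reading back the unmarked values prints: 2 3 5 7 11 13 */
    for (int n = 2; n <= LIMIT; n++)
        if (!composite[n])
            printf("%d ", n);
    putchar('\n');
    return 0;
}
```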

But, won't it miss something?

This would actually cover us until we start getting into uncovered multiples of higher primes. For instance, because we never processed multiples of 5, we'd hit an issue with the first composite multiple of 5 that 2 and 3 did not also cover: 25.

But, since we were ONLY checking up to 16, and 4 was its square root point, there is no need to worry about whether or not 25 is prime or composite, because it exceeds the range we were interested in.

Interestingly, 25 is also the square of 5, but that is less central here than the property of the multiples, for if we were to continue past 16 without ever processing 5, more of its multiples would go unmarked: 25, 35, 55, 65, 85, 95, and so on.

So if we never covered multiples of 5, we'd start to see these values ending in 5 fall through (55 and 65, for instance).

Certainly not ALL values ending in 5 would fall through (5 is a prime, after all, and some of them are multiples of 3 that were already covered: 15, 45, 75). But we should see the pattern of what can slip through the cracks if it is never checked (so if something needs to be checked, it absolutely needs to be checked).
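
To make the "slipping through the cracks" concrete, here is a hypothetical demonstration (again, not project code) that sieves up to 100 but deliberately keeps 4 as the stopping point, so only multiples of 2 and 3 are ever processed, and then reports which unmarked values are in fact composite:

```c
#include <stdio.h>

#define LIMIT 100

int main(void)
{
    char composite[LIMIT + 1] = {0};

    /* Deliberately stop too early: only process base values 2 through 4,
     * as if 4 (the square root of 16) were still our stopping point. */
    for (int base = 2; base <= 4; base++)
        for (int m = base * 2; m <= LIMIT; m += base)
            composite[m] = 1;

    /* Trial-divide each unmarked value to expose the ones that are
     * actually composite but slipped through the cracks. */
    for (int n = 5; n <= LIMIT; n++) {
        if (composite[n])
            continue;
        for (int d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                printf("%d ", n);   /* prints 25 35 49 55 65 77 85 91 95 */
                break;
            }
        }
    }
    putchar('\n');
    return 0;
}
```

Note that the stragglers include uncovered multiples of 7 (49, 77, 91) as well as the multiples of 5, which is the general point: every prime we skip leaves its own trail of unmarked composites.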

The trick is in recognizing the value of the square root point in relation to the numbers involved: any composite in the range must have a factor no larger than the square root, so once the base values exceed it, their uncovered multiples all lie beyond the range anyway. Because we're not going that high, we can use the square root point as a nice stopping point in our processing.

What about quantities, don't they screw with this scheme?

Yes, but only because sieves need to know up front how much space to allocate (which is tied to how many values they will process). That is why we (somewhat grossly) overestimate (see the relevant section on the project page, where we did a hack that allocates memory based on a rough estimation of how many values we'd have to cover to encounter the requested number of primes).

In the end, though, it makes no difference: quantity or range, we have a means of knowing the amount of space needed and the values to process (approximate, in the case of a quantity). We'll admittedly take a bit of a performance hit from the eventual overestimations on quantities (as they get larger), unless we increase the resolution of the estimation factors (beyond the x18, or the x6, x12, x18 example I showed on the project page).
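
To sketch how that quantity-to-range estimation might look (the x6/x12/x18 multipliers echo the project page example, but the cut-off points below are invented for illustration; the real thresholds live in the project code, and very large quantities would need the bigger factors just mentioned):

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical sketch: pick an overestimated sieve range from a requested
 * quantity of primes. Thresholds here are made up for illustration only. */
static size_t estimate_sieve_limit(size_t quantity)
{
    if (quantity < 100)
        return quantity * 6;      /* small requests: x6 is comfortably enough */
    if (quantity < 10000)
        return quantity * 12;     /* medium requests: x12 */
    return quantity * 18;         /* larger requests: x18 (still a rough hack) */
}

int main(void)
{
    /* e.g. asking for 25 primes would sieve up to 150 (the 25th prime is 97) */
    printf("%zu %zu %zu\n",
           estimate_sieve_limit(25),
           estimate_sieve_limit(1000),
           estimate_sieve_limit(50000));
    return 0;
}
```

The sieve would then allocate that many slots, sieve the whole overestimated range, and simply stop reading results back once the requested number of primes has been collected; the wasted slots are the performance hit mentioned above.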

But in the long run, performance of the sieve will be so staggeringly improved over primereg that, even with the noted performance hit to accommodate quantity, it should still be drastically faster than the time-constrained algorithms.