Comparing sqrt_ratio
function implementations
Comparison of performance of
sqrt_ratio
implementations. The new implementation (
new_method
below) merges the exponentiations in the Tonelli–Shanks algorithm with those in the inversion (as described in
Scott's paper). Thus 3 exponentiations (1 for each squareroot and 1 for the inversion) become 1 (computing the "projenitor" for
num^3*den
). When
p - 1 is
2-adically small one expects this to perform much better. On the other hand when
p - 1 = 2s t with
t small, most of the time is spent on the loop and little is saved. The worst case of this is the Fermat primes, the best case is
p = 3 (mod
4), but performance is pretty good when
t is smallish.
Each instance consists of computing
sqrt_ratio(&num, &den)
for 1000 random pairs
(num, den)
(so that the average 6.75
ms of
new_method
for
2255 - 19 equates to average 6.75
μs per instance). This was on my machine, your milage may vary.
See individual benchmark pages below for more details.
The favourable cases
The unfavourable cases