Wed, 31 Dec 2014 06:09:35 +0100
Cloned upstream origin tor-browser at tor-browser-31.3.0esr-4.5-1-build1
revision ID fc1c9ff7c1b2defdbc039f12214767608f46423f for hacking purpose.
michael@0 | 1 | Squaring Algorithm |
michael@0 | 2 | |
michael@0 | 3 | When you are squaring a value, you can take advantage of the fact that |
michael@0 | 4 | half the multiplications performed by the more general multiplication |
michael@0 | 5 | algorithm (see 'mul.txt' for a description) are redundant when the |
michael@0 | 6 | multiplicand equals the multiplier. |
michael@0 | 7 | |
michael@0 | 8 | In particular, the modified algorithm is: |
michael@0 | 9 | |
michael@0 | 10 | k = 0 |
michael@0 | 11 | for j <- 0 to (#a - 1) |
michael@0 | 12 | w = c[2*j] + (a[j] ^ 2); |
michael@0 | 13 | k = w div R |
michael@0 | 14 | |
michael@0 | 15 | for i <- j+1 to (#a - 1) |
michael@0 | 16 | w = (2 * a[j] * a[i]) + k + c[i+j] |
michael@0 | 17 | c[i+j] = w mod R |
michael@0 | 18 | k = w div R |
michael@0 | 19 | endfor |
michael@0 | 20 | c[i+j] = k; |
michael@0 | 21 | k = 0; |
michael@0 | 22 | endfor |
michael@0 | 23 | |
michael@0 | 24 | On the surface, this looks identical to the multiplication algorithm; |
michael@0 | 25 | however, note the following differences: |
michael@0 | 26 | |
michael@0 | 27 | - precomputation of the leading term in the outer loop |
michael@0 | 28 | |
michael@0 | 29 | - i runs from j+1 instead of from zero |
michael@0 | 30 | |
michael@0 | 31 | - doubling of a[i] * a[j] in the inner product |
michael@0 | 32 | |
michael@0 | 33 | Unfortunately, the construction of the inner product is such that we |
michael@0 | 34 | need more than two digits to represent the inner product, in some |
michael@0 | 35 | cases. In a C implementation, this means that some gymnastics must be |
michael@0 | 36 | performed in order to handle overflow, for which C has no direct |
michael@0 | 37 | abstraction. We do this by observing the following: |
michael@0 | 38 | |
michael@0 | 39 | If we have multiplied a[i] and a[j], and the product is more than half |
michael@0 | 40 | the maximum value expressible in two digits, then doubling this result |
michael@0 | 41 | will overflow into a third digit. If this occurs, we take note of the |
michael@0 | 42 | overflow, and double it anyway -- C integer arithmetic ignores |
michael@0 | 43 | overflow, so the two digits we get back should still be valid, modulo |
michael@0 | 44 | the overflow. |
michael@0 | 45 | |
michael@0 | 46 | Having doubled this value, we now have to add in the remainders and |
michael@0 | 47 | the digits already computed by earlier steps. If we did not overflow |
michael@0 | 48 | in the previous step, we might still cause an overflow here. That |
michael@0 | 49 | will happen whenever the maximum value expressible in two digits, less |
michael@0 | 50 | the amount we have to add, is greater than the result of the previous |
michael@0 | 51 | step. Thus, the overflow computation is: |
michael@0 | 52 | |
michael@0 | 53 | |
michael@0 | 54 | u = 0 |
michael@0 | 55 | w = a[i] * a[j] |
michael@0 | 56 | |
michael@0 | 57 | if(w > (R - 1)/ 2) |
michael@0 | 58 | u = 1; |
michael@0 | 59 | |
michael@0 | 60 | w = w * 2 |
michael@0 | 61 | v = c[i + j] + k |
michael@0 | 62 | |
michael@0 | 63 | if(u == 0 && (R - 1 - v) < w) |
michael@0 | 64 | u = 1 |
michael@0 | 65 | |
michael@0 | 66 | If there is an overflow, u will be 1, otherwise u will be 0. The rest |
michael@0 | 67 | of the parameters are the same as they are in the above description. |
michael@0 | 68 | |
michael@0 | 69 | ------------------------------------------------------------------ |
michael@0 | 70 | This Source Code Form is subject to the terms of the Mozilla Public |
michael@0 | 71 | # License, v. 2.0. If a copy of the MPL was not distributed with this |
michael@0 | 72 | # file, You can obtain one at http://mozilla.org/MPL/2.0/. |