michael@0: Squaring Algorithm michael@0: michael@0: When you are squaring a value, you can take advantage of the fact that michael@0: half the multiplications performed by the more general multiplication michael@0: algorithm (see 'mul.txt' for a description) are redundant when the michael@0: multiplicand equals the multiplier. michael@0: michael@0: In particular, the modified algorithm is: michael@0: michael@0: k = 0 michael@0: for j <- 0 to (#a - 1) michael@0: w = c[2*j] + (a[j] ^ 2); michael@0: k = w div R michael@0: michael@0: for i <- j+1 to (#a - 1) michael@0: w = (2 * a[j] * a[i]) + k + c[i+j] michael@0: c[i+j] = w mod R michael@0: k = w div R michael@0: endfor michael@0: c[i+j] = k; michael@0: k = 0; michael@0: endfor michael@0: michael@0: On the surface, this looks identical to the multiplication algorithm; michael@0: however, note the following differences: michael@0: michael@0: - precomputation of the leading term in the outer loop michael@0: michael@0: - i runs from j+1 instead of from zero michael@0: michael@0: - doubling of a[i] * a[j] in the inner product michael@0: michael@0: Unfortunately, the construction of the inner product is such that we michael@0: need more than two digits to represent the inner product, in some michael@0: cases. In a C implementation, this means that some gymnastics must be michael@0: performed in order to handle overflow, for which C has no direct michael@0: abstraction. We do this by observing the following: michael@0: michael@0: If we have multiplied a[i] and a[j], and the product is more than half michael@0: the maximum value expressible in two digits, then doubling this result michael@0: will overflow into a third digit. If this occurs, we take note of the michael@0: overflow, and double it anyway -- C integer arithmetic ignores michael@0: overflow, so the two digits we get back should still be valid, modulo michael@0: the overflow. michael@0: michael@0: Having doubled this value, we now have to add in the remainders and michael@0: the digits already computed by earlier steps. If we did not overflow michael@0: in the previous step, we might still cause an overflow here. That michael@0: will happen whenever the maximum value expressible in two digits, less michael@0: the amount we have to add, is greater than the result of the previous michael@0: step. Thus, the overflow computation is: michael@0: michael@0: michael@0: u = 0 michael@0: w = a[i] * a[j] michael@0: michael@0: if(w > (R - 1)/ 2) michael@0: u = 1; michael@0: michael@0: w = w * 2 michael@0: v = c[i + j] + k michael@0: michael@0: if(u == 0 && (R - 1 - v) < w) michael@0: u = 1 michael@0: michael@0: If there is an overflow, u will be 1, otherwise u will be 0. The rest michael@0: of the parameters are the same as they are in the above description. michael@0: michael@0: ------------------------------------------------------------------ michael@0: This Source Code Form is subject to the terms of the Mozilla Public michael@0: # License, v. 2.0. If a copy of the MPL was not distributed with this michael@0: # file, You can obtain one at http://mozilla.org/MPL/2.0/.