security/nss/lib/freebl/ecl/README.FP

Wed, 31 Dec 2014 06:09:35 +0100

author
Michael Schloh von Bennewitz <michael@schloh.com>
date
Wed, 31 Dec 2014 06:09:35 +0100
changeset 0
6474c204b198
permissions
-rw-r--r--

Cloned upstream origin tor-browser at tor-browser-31.3.0esr-4.5-1-build1
revision ID fc1c9ff7c1b2defdbc039f12214767608f46423f for hacking purpose.

     1 This Source Code Form is subject to the terms of the Mozilla Public
     2 License, v. 2.0. If a copy of the MPL was not distributed with this
     3 file, You can obtain one at http://mozilla.org/MPL/2.0/.
     5 The ECL exposes routines for constructing and converting curve
     6 parameters for internal use.
     8 The floating point code of the ECL provides algorithms for performing
     9 elliptic-curve point multiplications in floating point.
    11 The point multiplication algorithms perform calculations almost
    12 exclusively in floating point for efficiency, but have the same
    13 (integer) interface as the ECL for compatibility and to be easily
    14 wired-in to the ECL.  Please see README file (not this README.FP file)
    15 for information on wiring-in.
    17 This has been implemented for 3 curves as specified in [1]:
    18 	secp160r1
    19 	secp192r1
    20 	secp224r1
    22 RATIONALE
    23 =========
    24 Calculations are done in the  floating-point unit (FPU) since it
    25 gives better performance on the UltraSPARC III chips.  This is
    26 because the FPU allows for faster multiplication than the integer unit.
    27 The integer unit has a longer multiplication instruction latency, and
    28 does not allow full pipelining, as described in [2].
    29 Since performance is an important selling feature of Elliptic Curve
    30 Cryptography (ECC), this implementation was created.
    32 DATA REPRESENTATION
    33 ===================
    34 Data is primarily represented in an array of double-precision floating
    35 point numbers.  Generally, each array element has 24 bits of precision
    36 (i.e. be x * 2^y, where x is an integer of at most 24 bits, y some positive
    37 integer), although the actual implementation details are more complicated.
    39 e.g. a way to store an 80 bit number might be:
    40 double p[4] = { 632613 * 2^0, 329841 * 2^24, 9961 * 2^48, 51 * 2^64 };
    41 See section ARITHMETIC OPERATIONS for more details.
    43 This implementation assumes that the floating-point unit rounding mode 
    44 is round-to-even as specified in IEEE 754
    45 (as opposed to chopping, rounding up, or rounding down).
    46 When subtracting integers represented as arrays of floating point 
    47 numbers, some coefficients (array elements) may become negative.
    48 This effectively gives an extra bit of precision that is important
    49 for correctness in some cases.
    51 The described number presentation limits the size of integers to 1023 bits.
    52 This is due to an upper bound of 1024 for the exponent of a double precision
    53 floating point number as specified in IEEE-754.
    54 However, this is acceptable for ECC key sizes of the foreseeable future.
    56 DATA STRUCTURES
    57 ===============
    58 For more information on coordinate representations, see [3].
    60 ecfp_aff_pt
    61 -----------
    62 Affine EC Point Representation.  This is the basic 
    63 representation (x, y) of an elliptic curve point.
    65 ecfp_jac_pt
    66 -----------
    67 Jacobian EC Point.  This stores a point as (X, Y, Z), where
    68 the affine point corresponds to (X/Z^2, Y/Z^3).  This allows
    69 for fewer inversions in calculations.
    71 ecfp_chud_pt
    72 ------------
    73 Chudnovsky Jacobian Point.  This representation stores a point
    74 as (X, Y, Z, Z^2, Z^3), the same as a Jacobian representation
    75 but also storing Z^2 and Z^3 for faster point additions.
    77 ecfp_jm_pt
    78 ----------
    79 Modified Jacobian Point.  This representation stores a point
    80 as (X, Y, Z, a*Z^4), the same as Jacobian representation but
    81 also storing a*Z^4 for faster point doublings.  Here "a" represents
    82 the linear coefficient of x defining the curve.
    84 EC_group_fp
    85 -----------
    86 Stores information on the elliptic curve group for floating
    87 point calculations.  Contains curve specific information, as
    88 well as function pointers to routines, allowing different
    89 optimizations to be easily wired in.
    90 This should be made accessible from an ECGroup for the floating
    91 point implementations of point multiplication.
    93 POINT MULTIPLICATION ALGORITHMS
    94 ===============================
    95 Elliptic Curve Point multiplication can be done at a higher level orthogonal
    96 to the implementation of point additions and point doublings.  There
    97 are a variety of algorithms that can be used.
    99 The following algorithms have been implemented:
   101 4-bit Window (Jacobian Coordinates)
   102 Double & Add (Jacobian & Affine Coordinates)
   103 5-bit Non-Adjacent Form (Modified Jacobian & Chudnovsky Jacobian)
   105 Currently, the fastest algorithm for multiplying a generic point
   106 is the 5-bit Non-Adjacent Form.
   108 See comments in ecp_fp.c for more details and references.
   110 SOURCE / HEADER FILES
   111 =====================
   113 ecp_fp.c
   114 --------
   115 Main source file for floating point calculations.  Contains routines
   116 to convert from floating-point to integer (mp_int format), point
   117 multiplication algorithms, and several other routines.
   119 ecp_fp.h
   120 --------
   121 Main header file.  Contains most constants used and function prototypes.
   123 ecp_fp[160, 192, 224].c
   124 -----------------------
   125 Source files for specific curves.  Contains curve specific code such
   126 as specialized reduction based on the field defining prime.  Contains
   127 code wiring-in different algorithms and optimizations.
   129 ecp_fpinc.c
   130 -----------
   131 Source file that is included by ecp_fp[160, 192, 224].c.  This generates
   132 functions with different preprocessor-defined names and loop iterations,
   133 allowing for static linking and strong compiler optimizations without
   134 code duplication.
   136 TESTING
   137 =======
   138 The test suite can be found in ecl/tests/ecp_fpt.  This tests and gets
   139 timings of the different algorithms for the curves implemented.
   141 ARITHMETIC OPERATIONS
   142 ---------------------
   143 The primary operations in ECC over the prime fields are modular arithmetic:
   144 i.e. n * m (mod p) and n + m (mod p).  In this implementation, multiplication,
   145 addition, and reduction are implemented as separate functions.  This
   146 enables computation of formulae with fewer reductions, e.g.
   147 (a * b) + (c * d) (mod p) rather than:
   148 ((a * b) (mod p)) + ((c * d) (mod p)) (mod p)
   149 This takes advantage of the fact that the double precision mantissa in
   150 floating point can hold numbers up to 2^53, i.e. it has some leeway to
   151 store larger intermediate numbers.  See further detail in the section on 
   152 FLOATING POINT PRECISION.
   154 Multiplication
   155 --------------
   156 Multiplication is implemented in a standard polynomial multiplication
   157 fashion.  The terms in opposite factors are pairwise multiplied and
   158 added together appropriately.  Note that the result requires twice 
   159 as many doubles for storage, as the bit size of the product is twice
   160 that of the multiplicands.
   161 e.g. suppose we have double n[3], m[3], r[6], and want to calculate r = n * m
   162 r[0] = n[0] * m[0]
   163 r[1] = n[0] * m[1] + n[1] * m[0]
   164 r[2] = n[0] * m[2] + n[1] * m[1] + n[2] * m[0]
   165 r[3] = n[1] * m[2] + n[2] * m[1]
   166 r[4] = n[2] * m[2]
   167 r[5] = 0 (This is used later to hold spillover from r[4], see tidying in
   168 the reduction section.)
   170 Addition
   171 --------
   172 Addition is done term by term.  The only caveat is to be careful with
   173 the number of terms that need to be added.  When adding results of
   174 multiplication (before reduction), twice as many terms need to be added
   175 together.  This is done in the addLong function.
   176 e.g. for double n[4], m[4], r[4]:  r = n + m
   177 r[0] = n[0] + m[0]
   178 r[1] = n[1] + m[1]
   179 r[2] = n[2] + m[2]
   180 r[3] = n[3] + m[3]
   182 Modular Reduction
   183 -----------------
   184 For the curves implemented, reduction is possible by fast reduction
   185 for Generalized Mersenne Primes, as described in [4].  For the 
   186 floating point implementation, a significant step of the reduction 
   187 process is tidying:  that is, the propagation of carry bits from 
   188 low-order to high-order coefficients to reduce the precision of each 
   189 coefficient to 24 bits.
   190 This is done by adding and then subtracting
   191 ecfp_alpha, a large floating point number that induces precision roundoff.
   192 See [5] for more details on tidying using floating point arithmetic.
   193 e.g. suppose we have r = 961838 * 2^24 + 519308
   194 then if we set alpha = 3 * 2^51 * 2^24,
   195 FP(FP(r + alpha) - alpha) = 961838 * 2^24, because the precision for
   196 the intermediate results is limited.  Our values of alpha are chosen
   197 to truncate to a desired number of bits.
   199 The reduction is then performed as in [4], adding multiples of prime p.
   200 e.g. suppose we are working over a polynomial of 10^2.  Take the number
   201 2 * 10^8 + 11 * 10^6 + 53 * 10^4 + 23 * 10^2 + 95, stored in 5 elements
   202 for coefficients of 10^0, 10^2, ..., 10^8.
   203 We wish to reduce modulo p = 10^6 - 2 * 10^4 + 1
   204 We can subtract off from the higher terms
   205 (2 * 10^8 + 11 * 10^6 + 53 * 10^4 + 23 * 10^2 + 95) - (2 * 10^2) * (10^6 - 2 * 10^4 + 1)
   206 = 15 * 10^6 + 53 * 10^4 + 21 * 10^2 + 95
   207 = 15 * 10^6 + 53 * 10^4 + 21 * 10^2 + 95 - (15) * (10^6 - 2 * 10^4 + 1)
   208 = 83 * 10^4 + 21 * 10^2 + 80
   210 Integrated Example
   211 ------------------
   212 This example shows how multiplication, addition, tidying, and reduction
   213 work together in our modular arithmetic.  This is simplified from the
   214 actual implementation, but should convey the main concepts.  
   215 Working over polynomials of 10^2 and with p as in the prior example,
   216 Let a = 16 * 10^4 + 53 * 10^2 + 33
   217 let b = 81 * 10^4 + 31 * 10^2 + 49
   218 let c = 22 * 10^4 +  0 * 10^2 + 95
   219 And suppose we want to compute a * b + c mod p.  
   220 We first do a multiplication:  then a * b =
   221 0 * 10^10 + 1296 * 10^8 + 4789 * 10^6 + 5100 * 10^4 + 3620 * 10^2 + 1617
   222 Then we add in c before doing reduction, allowing us to get a * b + c =
   223 0 * 10^10 + 1296 * 10^8 + 4789 * 10^6 + 5122 * 10^4 + 3620 * 10^2 + 1712
   224 We then perform a tidying on the upper half of the terms:
   225 0 * 10^10 + 1296 * 10^8 + 4789 * 10^6
   226 0 * 10^10 + (1296 + 47) * 10^8 + 89 * 10^6
   227 0 * 10^10 + 1343 * 10^8 + 89 * 10^6
   228 13 * 10^10 + 43 * 10^8 + 89 * 10^6
   229 which then gives us
   230 13 * 10^10 + 43 * 10^8 + 89 * 10^6 + 5122 * 10^4 + 3620 * 10^2 + 1712
   231 we then reduce modulo p similar to the reduction example above:
   232 13 * 10^10 + 43 * 10^8 + 89 * 10^6 + 5122 * 10^4 + 3620 * 10^2 + 1712 
   233 	- (13 * 10^4 * p)
   234 69 * 10^8 + 89 * 10^6 + 5109 * 10^4 + 3620 * 10^2 + 1712
   235 	- (69 * 10^2 * p)
   236 227 * 10^6 + 5109 * 10^4 + 3551 * 10^2 + 1712
   237 	- (227 * p)
   238 5563 * 10^4 + 3551 * 10^2 + 1485
   239 finally, we do tidying to get the precision of each term down to 2 digits
   240 5563 * 10^4 + 3565 * 10^2 + 85
   241 5598 * 10^4 + 65 * 10^2 + 85
   242 55 * 10^6 + 98 * 10^4 + 65 * 10^2 + 85
   243 and perform another reduction step
   244 	- (55 * p)
   245 208 * 10^4 + 65 * 10^2 + 30
   246 There may be a small number of further reductions that could be done at
   247 this point, but this is typically done only at the end when converting
   248 from floating point  to an integer unit representation.
   250 FLOATING POINT PRECISION
   251 ========================
   252 This section discusses the precision of floating point numbers, which
   253 one writing new formulae or a larger bit size should be aware of.  The
   254 danger is that an intermediate result may be required to store a
   255 mantissa larger than 53 bits, which would cause error by rounding off.
   257 Note that the tidying with IEEE rounding mode set to round-to-even
   258 allows negative numbers, which actually reduces the size of the double
   259 mantissa to 23 bits - since it rounds the mantissa to the nearest number 
   260 modulo 2^24, i.e. roughly between -2^23 and 2^23.
   261 A multiplication increases the bit size to 2^46 * n, where n is the number
   262 of doubles to store a number.  For the 224 bit curve, n = 10.  This gives 
   263 doubles of size 5 * 2^47.  Adding two of these doubles gives a result
   264 of size 5 * 2^48, which is less than 2^53, so this is safe.  
   265 Similar analysis can be done for other formulae to ensure numbers remain
   266 below 2^53.
   268 Extended-Precision Floating Point
   269 ---------------------------------
   270 Some platforms, notably x86 Linux, may use an extended-precision floating
   271 point representation that has a 64-bit mantissa.  [6]  Although this
   272 implementation is optimized for the IEEE standard 53-bit mantissa,
   273 it should work with the 64-bit mantissa.  A check is done at run-time
   274 in the function ec_set_fp_precision that detects if the precision is
   275 greater than 53 bits, and runs code for the 64-bit mantissa accordingly.
   277 REFERENCES
   278 ==========
   279 [1] Certicom Corp., "SEC 2: Recommended Elliptic Curve Domain Parameters", Sept. 20, 2000.  www.secg.org
   280 [2] Sun Microsystems Inc.  UltraSPARC III Cu User's Manual, Version 1.0, May 2002, Table 4.4
   281 [3] H. Cohen, A. Miyaji, and T. Ono, "Efficient Elliptic Curve Exponentiation Using Mixed Coordinates".
   282 [4] Henk C.A. van Tilborg, Generalized Mersenne Prime.  http://www.win.tue.nl/~henkvt/GenMersenne.pdf
   283 [5] Daniel J. Bernstein, Floating-Point Arithmetic and Message Authentication, Journal of Cryptology, March 2000, Section 2.
   284 [6] Daniel J. Bernstein, Floating-Point Arithmetic and Message Authentication, Journal of Cryptology, March 2000, Section 2 Notes.

mercurial