Thu, 22 Jan 2015 13:21:57 +0100
Incorporate requested changes from Mozilla in review:
https://bugzilla.mozilla.org/show_bug.cgi?id=1123480#c6
michael@0 | 1 | MPI Library Timing Tests |
michael@0 | 2 | |
michael@0 | 3 | Hardware/OS |
michael@0 | 4 | (A) SGI O2 1 x MIPS R10000 250MHz IRIX 6.5.3 |
michael@0 | 5 | (B) IBM RS/6000 43P-240 1 x PowerPC 603e 223MHz AIX 4.3 |
michael@0 | 6 | (C) Dell GX1/L+ 1 x Pentium III 550MHz Linux 2.2.12-20 |
michael@0 | 7 | (D) PowerBook G3 1 x PowerPC 750 266MHz LinuxPPC 2.2.6-15apmac |
michael@0 | 8 | (E) PowerBook G3 1 x PowerPC 750 266MHz MacOS 8.5.1 |
michael@0 | 9 | (F) PowerBook G3 1 x PowerPC 750 400MHz MacOS 9.0.2 |
michael@0 | 10 | |
michael@0 | 11 | Compiler |
michael@0 | 12 | (1) MIPSpro C 7.2.1 -O3 optimizations |
michael@0 | 13 | (2) GCC 2.95.1 -O3 optimizations |
michael@0 | 14 | (3) IBM AIX xlc -O3 optimizations (version unknown) |
michael@0 | 15 | (4) EGCS 2.91.66 -O3 optimizations |
michael@0 | 16 | (5) Metrowerks CodeWarrior 5.0 C, all optimizations |
michael@0 | 17 | (6) MIPSpro C 7.30 -O3 optimizations |
michael@0 | 18 | (7) same as (6), with optimized libmalloc.so |
michael@0 | 19 | |
michael@0 | 20 | Timings are given in seconds, computed using the C library's clock() |
michael@0 | 21 | function. The first column gives the hardware and compiler |
michael@0 | 22 | configuration used for the test. The second column indicates the |
michael@0 | 23 | number of tests that were aggregated to get the statistics for that |
michael@0 | 24 | size. The tests were compiled using 16-bit digits. |
michael@0 | 25 | |
michael@0 | 26 | Source data were generated randomly using a fixed seed, so they should |
michael@0 | 27 | be internally consistent, but may vary on different systems depending |
michael@0 | 28 | on the C library. Also, since the resolution of the timer accessed by |
michael@0 | 29 | clock() varies, there may be some variance in the precision of these |
michael@0 | 30 | measurements. |
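
The measurement procedure described above can be sketched roughly as follows. This is a minimal illustration in Python, not the library's actual C test harness: `time.process_time()` stands in for the C library's `clock()`, and the names `run_trials`, `summarize`, `format_row`, and the sample workload are all hypothetical. The generator is seeded with a fixed value so that repeated runs on the same system see the same inputs.

```python
import random
import time


def run_trials(workload, count, seed=0):
    """Time `count` calls of `workload(rng)` and return the per-call CPU times."""
    rng = random.Random(seed)  # fixed seed: identical inputs on every run
    times = []
    for _ in range(count):
        start = time.process_time()  # CPU time, analogous to C's clock()
        workload(rng)
        times.append(time.process_time() - start)
    return times


def summarize(times):
    """Aggregate per-test timings into min/avg/max/sum statistics."""
    total = sum(times)
    return {
        "count": len(times),
        "min": min(times),
        "avg": total / len(times),
        "max": max(times),
        "sum": total,
    }


def format_row(label, stats):
    """Render one result line in the layout used by the tables in this file."""
    return "{} {} min={:.2f}, avg={:.2f}, max={:.2f}, sum={:.2f}".format(
        label, stats["count"], stats["min"], stats["avg"], stats["max"], stats["sum"]
    )


if __name__ == "__main__":
    # Hypothetical workload: multiply two random 512-bit integers.
    def workload(rng):
        return rng.getrandbits(512) * rng.getrandbits(512)

    print(format_row("X0", summarize(run_trials(workload, 200))))
```

Because the timer has limited resolution, very short tests can legitimately report a minimum of 0.00, as several rows in the tables do.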
michael@0 | 31 | |
michael@0 | 32 | Prime Generation (primegen) |
michael@0 | 33 | |
michael@0 | 34 | 128 bits: |
michael@0 | 35 | A1 200 min=0.03, avg=0.19, max=0.72, sum=38.46 |
michael@0 | 36 | A2 200 min=0.02, avg=0.16, max=0.62, sum=32.55 |
michael@0 | 37 | B3 200 min=0.01, avg=0.07, max=0.22, sum=13.29 |
michael@0 | 38 | C4 200 min=0.00, avg=0.03, max=0.20, sum=6.14 |
michael@0 | 39 | D4 200 min=0.00, avg=0.05, max=0.33, sum=9.70 |
michael@0 | 40 | A6 200 min=0.01, avg=0.09, max=0.36, sum=17.48 |
michael@0 | 41 | A7 200 min=0.00, avg=0.05, max=0.24, sum=10.07 |
michael@0 | 42 | |
michael@0 | 43 | 192 bits: |
michael@0 | 44 | A1 200 min=0.05, avg=0.45, max=3.13, sum=89.96 |
michael@0 | 45 | A2 200 min=0.04, avg=0.39, max=2.61, sum=77.55 |
michael@0 | 46 | B3 200 min=0.02, avg=0.18, max=1.25, sum=36.97 |
michael@0 | 47 | C4 200 min=0.01, avg=0.09, max=0.33, sum=18.24 |
michael@0 | 48 | D4 200 min=0.02, avg=0.15, max=0.54, sum=29.63 |
michael@0 | 49 | A6 200 min=0.02, avg=0.24, max=1.70, sum=47.84 |
michael@0 | 50 | A7 200 min=0.01, avg=0.15, max=1.05, sum=30.88 |
michael@0 | 51 | |
michael@0 | 52 | 256 bits: |
michael@0 | 53 | A1 200 min=0.08, avg=0.92, max=6.13, sum=184.79 |
michael@0 | 54 | A2 200 min=0.06, avg=0.76, max=5.03, sum=151.11 |
michael@0 | 55 | B3 200 min=0.04, avg=0.41, max=2.68, sum=82.35 |
michael@0 | 56 | C4 200 min=0.02, avg=0.19, max=0.69, sum=37.91 |
michael@0 | 57 | D4 200 min=0.03, avg=0.31, max=1.15, sum=63.00 |
michael@0 | 58 | A6 200 min=0.04, avg=0.48, max=3.13, sum=95.46 |
michael@0 | 59 | A7 200 min=0.03, avg=0.37, max=2.36, sum=73.60 |
michael@0 | 60 | |
michael@0 | 61 | 320 bits: |
michael@0 | 62 | A1 200 min=0.11, avg=1.59, max=6.14, sum=318.81 |
michael@0 | 63 | A2 200 min=0.09, avg=1.27, max=4.93, sum=254.03 |
michael@0 | 64 | B3 200 min=0.07, avg=0.82, max=3.13, sum=163.80 |
michael@0 | 65 | C4 200 min=0.04, avg=0.44, max=1.91, sum=87.59 |
michael@0 | 66 | D4 200 min=0.06, avg=0.73, max=3.22, sum=146.73 |
michael@0 | 67 | A6 200 min=0.07, avg=0.93, max=3.50, sum=185.01 |
michael@0 | 68 | A7 200 min=0.05, avg=0.76, max=2.94, sum=151.78 |
michael@0 | 69 | |
michael@0 | 70 | 384 bits: |
michael@0 | 71 | A1 200 min=0.16, avg=2.69, max=11.41, sum=537.89 |
michael@0 | 72 | A2 200 min=0.13, avg=2.15, max=9.03, sum=429.14 |
michael@0 | 73 | B3 200 min=0.11, avg=1.54, max=6.49, sum=307.78 |
michael@0 | 74 | C4 200 min=0.06, avg=0.81, max=4.84, sum=161.13 |
michael@0 | 75 | D4 200 min=0.10, avg=1.38, max=8.31, sum=276.81 |
michael@0 | 76 | A6 200 min=0.11, avg=1.73, max=7.36, sum=345.55 |
michael@0 | 77 | A7 200 min=0.09, avg=1.46, max=6.12, sum=292.02 |
michael@0 | 78 | |
michael@0 | 79 | 448 bits: |
michael@0 | 80 | A1 200 min=0.23, avg=3.36, max=15.92, sum=672.63 |
michael@0 | 81 | A2 200 min=0.17, avg=2.61, max=12.25, sum=522.86 |
michael@0 | 82 | B3 200 min=0.16, avg=2.10, max=9.83, sum=420.86 |
michael@0 | 83 | C4 200 min=0.09, avg=1.44, max=7.64, sum=288.36 |
michael@0 | 84 | D4 200 min=0.16, avg=2.50, max=13.29, sum=500.17 |
michael@0 | 85 | A6 200 min=0.15, avg=2.31, max=10.81, sum=461.58 |
michael@0 | 86 | A7 200 min=0.14, avg=2.03, max=9.53, sum=405.16 |
michael@0 | 87 | |
michael@0 | 88 | 512 bits: |
michael@0 | 89 | A1 200 min=0.30, avg=6.12, max=22.18, sum=1223.35 |
michael@0 | 90 | A2 200 min=0.25, avg=4.67, max=16.90, sum=933.18 |
michael@0 | 91 | B3 200 min=0.23, avg=4.13, max=14.94, sum=825.45 |
michael@0 | 92 | C4 200 min=0.13, avg=2.08, max=9.75, sum=415.22 |
michael@0 | 93 | D4 200 min=0.24, avg=4.04, max=20.18, sum=808.11 |
michael@0 | 94 | A6 200 min=0.22, avg=4.47, max=16.19, sum=893.83 |
michael@0 | 95 | A7 200 min=0.20, avg=4.03, max=14.65, sum=806.02 |
michael@0 | 96 | |
michael@0 | 97 | Modular Exponentiation (metime) |
michael@0 | 98 | |
michael@0 | 99 | The following results are aggregated from 200 pseudo-randomly |
michael@0 | 100 | generated tests, based on a fixed seed. |
michael@0 | 101 | |
michael@0 | 102 | base, exponent, and modulus size (bits) |
michael@0 | 103 | P/C 128 192 256 320 384 448 512 640 768 896 1024 |
michael@0 | 104 | ------- ----------------------------------------------------------------- |
michael@0 | 105 | A1 0.015 0.027 0.047 0.069 0.098 0.133 0.176 0.294 0.458 0.680 1.040 |
michael@0 | 106 | A2 0.013 0.024 0.037 0.053 0.077 0.102 0.133 0.214 0.326 0.476 0.668 |
michael@0 | 107 | B3 0.005 0.011 0.021 0.036 0.056 0.084 0.121 0.222 0.370 0.573 0.840 |
michael@0 | 108 | C4 0.002 0.006 0.011 0.020 0.032 0.048 0.069 0.129 0.223 0.344 0.507 |
michael@0 | 109 | D4 0.004 0.010 0.019 0.034 0.056 0.085 0.123 0.232 0.390 0.609 0.899 |
michael@0 | 110 | E5 0.007 0.015 0.031 0.055 0.088 0.133 0.183 0.342 0.574 0.893 1.317 |
michael@0 | 111 | A6 0.008 0.016 0.038 0.042 0.064 0.093 0.133 0.239 0.393 0.604 0.880 |
michael@0 | 112 | A7 0.005 0.011 0.020 0.036 0.056 0.083 0.121 0.223 0.374 0.583 0.855 |
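
For orientation, one way such a test could be set up is sketched below. This is an assumption-laden illustration, not the metime program itself: Python's built-in three-argument `pow()` stands in for the library's modular exponentiation routine, the helper names are hypothetical, and forcing the top bit of each operand (and an odd modulus) is a choice made here so that every value has exactly the stated size.

```python
import random


def random_operand(rng, bits):
    """Return a random integer with exactly `bits` bits (top bit forced to 1)."""
    return rng.getrandbits(bits) | (1 << (bits - 1))


def modexp_inputs(bits, count, seed=0):
    """Yield `count` (base, exponent, modulus) triples of the given bit size."""
    rng = random.Random(seed)  # fixed seed, so the inputs are reproducible
    for _ in range(count):
        base = random_operand(rng, bits)
        exponent = random_operand(rng, bits)
        modulus = random_operand(rng, bits) | 1  # odd modulus: an assumption
        yield base, exponent, modulus


def run_modexp(bits, count, seed=0):
    """Compute base**exponent mod modulus for each generated triple."""
    return [pow(b, e, m) for b, e, m in modexp_inputs(bits, count, seed)]


if __name__ == "__main__":
    results = run_modexp(bits=512, count=200)
    print(len(results), "modular exponentiations computed")
```

Wrapping each `pow()` call in a CPU timer and averaging over the 200 triples would give a per-size figure comparable in kind, though not in magnitude, to the entries in the table.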
michael@0 | 113 | |
michael@0 | 114 | Multiplication and Squaring Tests (mulsqr) |
michael@0 | 115 | |
michael@0 | 116 | The following results are aggregated from 500000 pseudo-randomly |
michael@0 | 117 | generated tests, based on a per-run wall-clock seed. Times are given |
michael@0 | 118 | in seconds, except where indicated in microseconds (us). |
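
The derived columns in the tables that follow appear to be computed from the two totals as sketched here. The formulas are inferred from the published numbers rather than taken from the mulsqr source: the "ad" column is read as the direction of the comparison (`>` when multiplication is slower than squaring), the percentage is taken relative to the slower operation, and the per-operation times are the totals divided by the 500000 tests. The function name `derived_columns` is hypothetical.

```python
TESTS = 500000  # number of tests aggregated per row, as stated in the text


def derived_columns(multiply_s, square_s, tests=TESTS):
    """Return (direction, percent, time/mult in us, time/square in us).

    `multiply_s` and `square_s` are total times in seconds for all tests.
    """
    slower = max(multiply_s, square_s)
    direction = ">" if multiply_s > square_s else "<"
    # Relative difference, measured against the slower operation (inferred).
    percent = abs(multiply_s - square_s) / slower * 100.0
    per_mult_us = multiply_s / tests * 1e6
    per_square_us = square_s / tests * 1e6
    return direction, percent, per_mult_us, per_square_us


if __name__ == "__main__":
    # Totals taken from the 64-bit rows of the (A1) and (A2) tables.
    for label, mult, sq in [("A1", 9.33, 9.15), ("A2", 7.87, 7.95)]:
        d, pct, tm, ts = derived_columns(mult, sq)
        print("{} {} {:.1f} {:.1f}us {:.1f}us".format(label, d, pct, tm, ts))
```

With the 64-bit totals for (A1), 9.33 and 9.15 seconds, this reproduces the published row: `>`, 1.9 percent, 18.7us per multiplication, and 18.3us per squaring.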
michael@0 | 119 | |
michael@0 | 120 | (A1) |
michael@0 | 121 | |
michael@0 | 122 | bits multiply square ad percent time/mult time/square |
michael@0 | 123 | 64 9.33 9.15 > 1.9 18.7us 18.3us |
michael@0 | 124 | 128 10.88 10.44 > 4.0 21.8us 20.9us |
michael@0 | 125 | 192 13.30 11.89 > 10.6 26.7us 23.8us |
michael@0 | 126 | 256 14.88 12.64 > 15.1 29.8us 25.3us |
michael@0 | 127 | 320 18.64 15.01 > 19.5 37.3us 30.0us |
michael@0 | 128 | 384 23.11 17.70 > 23.4 46.2us 35.4us |
michael@0 | 129 | 448 28.28 20.88 > 26.2 56.6us 41.8us |
michael@0 | 130 | 512 34.09 24.51 > 28.1 68.2us 49.0us |
michael@0 | 131 | 640 47.86 33.25 > 30.5 95.7us 66.5us |
michael@0 | 132 | 768 64.91 43.54 > 32.9 129.8us 87.1us |
michael@0 | 133 | 896 84.49 55.48 > 34.3 169.0us 111.0us |
michael@0 | 134 | 1024 107.25 69.21 > 35.5 214.5us 138.4us |
michael@0 | 135 | 1536 227.97 141.91 > 37.8 456.0us 283.8us |
michael@0 | 136 | 2048 394.05 242.15 > 38.5 788.1us 484.3us |
michael@0 | 137 | |
michael@0 | 138 | (A2) |
michael@0 | 139 | |
michael@0 | 140 | bits multiply square ad percent time/mult time/square |
michael@0 | 141 | 64 7.87 7.95 < 1.0 15.7us 15.9us |
michael@0 | 142 | 128 9.40 9.19 > 2.2 18.8us 18.4us |
michael@0 | 143 | 192 11.15 10.59 > 5.0 22.3us 21.2us |
michael@0 | 144 | 256 12.02 11.16 > 7.2 24.0us 22.3us |
michael@0 | 145 | 320 14.62 13.43 > 8.1 29.2us 26.9us |
michael@0 | 146 | 384 17.72 15.80 > 10.8 35.4us 31.6us |
michael@0 | 147 | 448 21.24 18.51 > 12.9 42.5us 37.0us |
michael@0 | 148 | 512 25.36 21.78 > 14.1 50.7us 43.6us |
michael@0 | 149 | 640 34.57 29.00 > 16.1 69.1us 58.0us |
michael@0 | 150 | 768 46.10 37.60 > 18.4 92.2us 75.2us |
michael@0 | 151 | 896 58.94 47.72 > 19.0 117.9us 95.4us |
michael@0 | 152 | 1024 73.76 59.12 > 19.8 147.5us 118.2us |
michael@0 | 153 | 1536 152.00 118.80 > 21.8 304.0us 237.6us |
michael@0 | 154 | 2048 259.41 199.57 > 23.1 518.8us 399.1us |
michael@0 | 155 | |
michael@0 | 156 | (B3) |
michael@0 | 157 | |
michael@0 | 158 | bits multiply square ad percent time/mult time/square |
michael@0 | 159 | 64 2.60 2.47 > 5.0 5.20us 4.94us |
michael@0 | 160 | 128 4.43 4.06 > 8.4 8.86us 8.12us |
michael@0 | 161 | 192 7.03 6.10 > 13.2 14.1us 12.2us |
michael@0 | 162 | 256 10.44 8.59 > 17.7 20.9us 17.2us |
michael@0 | 163 | 320 14.44 11.64 > 19.4 28.9us 23.3us |
michael@0 | 164 | 384 19.12 15.08 > 21.1 38.2us 30.2us |
michael@0 | 165 | 448 24.55 19.09 > 22.2 49.1us 38.2us |
michael@0 | 166 | 512 31.03 23.53 > 24.2 62.1us 47.1us |
michael@0 | 167 | 640 45.05 33.80 > 25.0 90.1us 67.6us |
michael@0 | 168 | 768 63.02 46.05 > 26.9 126.0us 92.1us |
michael@0 | 169 | 896 83.74 60.29 > 28.0 167.5us 120.6us |
michael@0 | 170 | 1024 106.73 76.65 > 28.2 213.5us 153.3us |
michael@0 | 171 | 1536 228.94 160.98 > 29.7 457.9us 322.0us |
michael@0 | 172 | 2048 398.08 275.93 > 30.7 796.2us 551.9us |
michael@0 | 173 | |
michael@0 | 174 | (C4) |
michael@0 | 175 | |
michael@0 | 176 | bits multiply square ad percent time/mult time/square |
michael@0 | 177 | 64 1.34 1.28 > 4.5 2.68us 2.56us |
michael@0 | 178 | 128 2.76 2.59 > 6.2 5.52us 5.18us |
michael@0 | 179 | 192 4.52 4.16 > 8.0 9.04us 8.32us |
michael@0 | 180 | 256 6.64 5.99 > 9.8 13.3us 12.0us |
michael@0 | 181 | 320 9.20 8.13 > 11.6 18.4us 16.3us |
michael@0 | 182 | 384 12.01 10.58 > 11.9 24.0us 21.2us |
michael@0 | 183 | 448 15.24 13.33 > 12.5 30.5us 26.7us |
michael@0 | 184 | 512 19.02 16.46 > 13.5 38.0us 32.9us |
michael@0 | 185 | 640 27.56 23.54 > 14.6 55.1us 47.1us |
michael@0 | 186 | 768 37.89 31.78 > 16.1 75.8us 63.6us |
michael@0 | 187 | 896 49.24 41.42 > 15.9 98.5us 82.8us |
michael@0 | 188 | 1024 62.59 52.18 > 16.6 125.2us 104.3us |
michael@0 | 189 | 1536 131.66 107.72 > 18.2 263.3us 215.4us |
michael@0 | 190 | 2048 226.45 182.95 > 19.2 453.0us 365.9us |
michael@0 | 191 | |
michael@0 | 192 | (A7) |
michael@0 | 193 | |
michael@0 | 194 | bits multiply square ad percent time/mult time/square |
michael@0 | 195 | 64 1.74 1.71 > 1.7 3.48us 3.42us |
michael@0 | 196 | 128 3.48 2.96 > 14.9 6.96us 5.92us |
michael@0 | 197 | 192 5.74 4.60 > 19.9 11.5us 9.20us |
michael@0 | 198 | 256 8.75 6.61 > 24.5 17.5us 13.2us |
michael@0 | 199 | 320 12.5 8.99 > 28.1 25.0us 18.0us |
michael@0 | 200 | 384 16.9 11.9 > 29.6 33.8us 23.8us |
michael@0 | 201 | 448 22.2 15.2 > 31.7 44.4us 30.4us |
michael@0 | 202 | 512 28.3 19.0 > 32.7 56.6us 38.0us |
michael@0 | 203 | 640 42.4 28.0 > 34.0 84.8us 56.0us |
michael@0 | 204 | 768 59.4 38.5 > 35.2 118.8us 77.0us |
michael@0 | 205 | 896 79.5 51.2 > 35.6 159.0us 102.4us |
michael@0 | 206 | 1024 102.6 65.5 > 36.2 205.2us 131.0us |
michael@0 | 207 | 1536 224.3 140.6 > 37.3 448.6us 281.2us |
michael@0 | 208 | 2048 393.4 244.3 > 37.9 786.8us 488.6us |
michael@0 | 209 | |
michael@0 | 210 | ------------------------------------------------------------------ |
michael@0 | 211 | This Source Code Form is subject to the terms of the Mozilla Public |
michael@0 | 212 | License, v. 2.0. If a copy of the MPL was not distributed with this |
michael@0 | 213 | file, You can obtain one at http://mozilla.org/MPL/2.0/. |