security/nss/tests/doc/platform_specific_problems

changeset 0
6474c204b198
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/security/nss/tests/doc/platform_specific_problems	Wed Dec 31 06:09:35 2014 +0100
     1.3 @@ -0,0 +1,110 @@
     1.4 +I will eventually convert all files here to HTML; right now I just have no
     1.5 +time to do it. Anyone who would like to, please feel free: mail me the file and
     1.6 +I will check it in.
     1.7 +sonmi@netscape.com
     1.8 +
     1.9 +
    1.10 +The NSS 3.1 SSL Stress Tests fail for me on FreeBSD 3.5.  The end of the output
    1.11 +of './ssl.sh stress' looks like this:
    1.12 +
    1.13 +********************* Stress Test  ****************************
    1.14 +********************* Stress SSL2 RC4 128 with MD5 ****************************
    1.15 +selfserv -p 8443 -d
    1.16 +/local/llennox/NSS-PSM/mozilla/tests_results/security/conrail.20/server -n
    1.17 +conrail.cs.columbia.edu -w nss   -i /tmp/tests_pid.5505  & strsclnt -p 8443 -d . -w nss -c 1000 -C A  conrail.cs.columbia.edu 
    1.18 +strsclnt: -- SSL: Server Certificate Validated.
    1.19 +strsclnt: PR_NewTCPSocket returned error -5974:
    1.20 +Insufficient system resources.
    1.21 +Terminated 
    1.22 +********************* Stress SSL3 RC4 128 with MD5 ****************************
    1.23 +selfserv -p 8443 -d
    1.24 +/local/llennox/NSS-PSM/mozilla/tests_results/security/conrail.20/server -n
    1.25 +conrail.cs.columbia.edu -w nss   -i /tmp/tests_pid.5505  & strsclnt -p 8443 -d . -w nss -c 1000 -C c  conrail.cs.columbia.edu 
    1.26 +strsclnt: -- SSL: Server Certificate Validated.
    1.27 +strsclnt: PR_NewTCPSocket returned error -5974:
    1.28 +Insufficient system resources.
    1.29 +Terminated 
    1.30 +
    1.31 +Running ktrace on the process (ktrace is a system-call tracer, the equivalent of
    1.32 +Linux's strace) reveals that socket() failed with ENOBUFS after it was called
    1.33 +for the 953rd time for the first test, and it failed after the 27th time it was
    1.34 +called for the second test.
    1.35 +
    1.36 +The failure is consistent, both for debug and optimized builds; I haven't tested
    1.37 +to see whether the count of socket() failures is consistent.
    1.38 +
    1.39 +All the other NSS tests pass successfully.
    1.40 +
    1.41 +
    1.42 +------- Additional Comments From Nelson Bolyard 2000-11-01 23:08 -------
    1.43 +
    1.44 +I see no indication of any error on NSS's part from this description.
    1.45 +It sounds like an OS kernel configuration problem on the
    1.46 +submitter's system.  The stress test is just that.  It stresses
    1.47 +the server by pounding it with SSL connections.  Apparently this
    1.48 +test exhausts some kernel resource on the submitter's system.
    1.49 +
    1.50 +The only change to NSS that might be beneficial to this test 
    1.51 +would be to respond to this error by waiting and trying again
    1.52 +for some limited number of times, rather than immediately 
    1.53 +treating it as a fatal error.  
    1.54 +
    1.55 +However, while such a change might make the test appear to pass,
    1.56 +it would merely be hiding a very serious problem, namely,
    1.57 +chronic system resource exhaustion.
    1.58 +
    1.59 +So, I suggest that, in this case, the failure serves the useful
    1.60 +purpose of revealing the system problem, which needs to be 
    1.61 +cured apart from any changes to NSS.
    1.62 +
    1.63 +I'll leave this bug open for a few more days, to give others
    1.64 +a chance to persuade me that some NSS change would and should
    1.65 +solve this problem.  
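
For illustration only, the "wait and try again a limited number of times" idea discussed above could look roughly like the sketch below. It is written in Python rather than against NSPR, and every name and default in it (new_tcp_socket_with_retry, max_attempts, delay_seconds, factory, sleep) is hypothetical. As the comment above points out, a retry like this would only mask resource exhaustion, not cure it.

```python
import errno
import socket
import time


def new_tcp_socket_with_retry(max_attempts=5, delay_seconds=0.5,
                              factory=None, sleep=time.sleep):
    """Create a TCP socket, retrying a bounded number of times on ENOBUFS.

    Hypothetical helper: this is not NSS or NSPR code, and the defaults
    are arbitrary illustration values.
    """
    if factory is None:
        def factory():
            return socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    for attempt in range(1, max_attempts + 1):
        try:
            return factory()
        except OSError as exc:
            # Retry only on buffer/resource exhaustion; re-raise any other
            # error, and re-raise ENOBUFS once the attempts are used up.
            if exc.errno != errno.ENOBUFS or attempt == max_attempts:
                raise
            sleep(delay_seconds)
```

The factory and sleep parameters exist only so the retry logic can be exercised without actually exhausting kernel resources.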
    1.66 +
    1.67 +
    1.68 +------- Additional Comments From Jonathan Lennox 2000-11-02 13:13 -------
    1.69 +
    1.70 +Okay, some more investigation leads me to agree with you.  What's happening is
    1.71 +that the TCP connections from the stress test stick around in TIME_WAIT for two
    1.72 +minutes; my kernel is only configured to support 1064 simultaneous open sockets,
    1.73 +which isn't enough for the 2K sockets opened by the stress test plus the 100 or
    1.74 +so normally in use on my system.
    1.75 +
    1.76 +So I'd just suggest adding a note to the NSS test webpage to the effect of "The
    1.77 +SSL stress test opens 2,048 TCP connections in quick succession.  Kernel data
    1.78 +structures may remain allocated for these connections for up to two minutes. 
    1.79 +Some systems may not be configured to allow this many simultaneous connections
    1.80 +by default; if the stress tests fail, try increasing the number of simultaneous
    1.81 +sockets supported."
    1.82 +
    1.83 +On FreeBSD, you can display the maximum number of simultaneous sockets with the command
    1.84 +	sysctl kern.ipc.maxsockets
    1.85 +which on my system returns 1064.
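
As a rough sketch of the arithmetic above, the following Python compares the configured limit with what the stress test needs. The function names are hypothetical, the 2048 and 100 defaults are the figures quoted in this thread, and read_max_sockets assumes a FreeBSD host where sysctl -n prints the bare value.

```python
import subprocess


def read_max_sockets():
    """Return kern.ipc.maxsockets as an int (assumes FreeBSD's sysctl)."""
    out = subprocess.run(
        ["sysctl", "-n", "kern.ipc.maxsockets"],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())


def socket_shortfall(max_sockets, test_sockets=2048, baseline_sockets=100):
    """Return how many sockets the limit falls short by (0 if sufficient).

    test_sockets: connections opened by the stress test (per this thread).
    baseline_sockets: sockets normally in use on the reporter's system.
    """
    needed = test_sockets + baseline_sockets
    return max(0, needed - max_sockets)
```

With the reported limit, socket_shortfall(1064) returns 1084, so the limit would need to be roughly doubled before the stress test could fit alongside normal usage.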
    1.86 +
    1.87 +It looks like this can be fixed with the kernel config option
    1.88 +	options NMBCLUSTERS=[something-large]
    1.89 +or by increasing the 'maxusers' parameter.
    1.90 +
    1.91 +It looks like more recent FreeBSD implementations still have this limitation,
    1.92 +and the same solutions apply, plus you can alternatively specify the maxsockets
    1.93 +parameter in the boot loader.
    1.94 +
    1.95 +
    1.96 +---------------------------------
    1.97 +
    1.98 +hpux HP-UX hp64 B.11.00 A 9000/800 2014971275 two-user license
    1.99 +
   1.100 +We had to change the following kernel parameters to make our tests pass:
   1.101 +
   1.102 +1. maxfiles.  old value = 60.  new value = 100.
   1.103 +2. nkthread.  old value = 499.  new value = 1328.
   1.104 +3. max_thread_proc.  old value = 64.  new value = 512.
   1.105 +4. maxusers.  old value = 32.  new value = 64.
   1.106 +5. maxuprc.  old value = 75.  new value = 512.
   1.107 +6. nproc.  old formula = 20+8*MAXUSERS, which evaluated to 276.
   1.108 +   new value (note: not a formula) = 750.
   1.109 +
   1.110 +A few other kernel parameters were also changed automatically
   1.111 +as a result of the above changes.
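
The nproc figures in item 6 can be checked with a short calculation. This is only a sketch: the function name is hypothetical, and the formula and values are the ones listed above.

```python
def nproc_from_formula(maxusers):
    """Old HP-UX nproc formula quoted above: 20 + 8 * MAXUSERS."""
    return 20 + 8 * maxusers


old_nproc = nproc_from_formula(32)       # old maxusers = 32
formula_at_new = nproc_from_formula(64)  # what the formula would give at 64
new_nproc = 750                          # fixed value actually chosen

print(old_nproc, formula_at_new, new_nproc)
```

The old formula gives 276 at maxusers = 32, matching the list, and would give only 532 at the new maxusers of 64, so the fixed value of 750 is higher than the formula would have produced.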
   1.112 +
   1.113 +
