I will eventually convert all files here to HTML; right now I just have no
time to do it. Anyone who'd like to, please feel free: mail me the file and
I will check it in.
sonmi@netscape.com


The NSS 3.1 SSL Stress Tests fail for me on FreeBSD 3.5. The end of the output
of './ssl.sh stress' looks like this:

********************* Stress Test ****************************
********************* Stress SSL2 RC4 128 with MD5 ****************************
selfserv -p 8443 -d
/local/llennox/NSS-PSM/mozilla/tests_results/security/conrail.20/server -n
conrail.cs.columbia.edu -w nss -i /tmp/tests_pid.5505 & strsclnt -p 8443 -d . -w nss -c 1000 -C A conrail.cs.columbia.edu
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: PR_NewTCPSocket returned error -5974:
Insufficient system resources.
Terminated
********************* Stress SSL3 RC4 128 with MD5 ****************************
selfserv -p 8443 -d
/local/llennox/NSS-PSM/mozilla/tests_results/security/conrail.20/server -n
conrail.cs.columbia.edu -w nss -i /tmp/tests_pid.5505 & strsclnt -p 8443 -d . -w nss -c 1000 -C c conrail.cs.columbia.edu
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: PR_NewTCPSocket returned error -5974:
Insufficient system resources.
Terminated

Running ktrace on the process (ktrace is a system-call tracer, the equivalent of
Linux's strace) reveals that socket() failed with ENOBUFS after it was called
for the 953rd time for the first test, and it failed after the 27th time it was
called for the second test.

The failure is consistent, both for debug and optimized builds; I haven't tested
to see whether the count of socket() failures is consistent.

All the other NSS tests pass successfully.


------- Additional Comments From Nelson Bolyard 2000-11-01 23:08 -------

I see no indication of any error on NSS's part from this description.
It sounds like an OS kernel configuration problem on the
submitter's system. The stress test is just that. It stresses
the server by pounding it with SSL connections. Apparently this
test exhausts some kernel resource on the submitter's system.

The only change to NSS that might be beneficial to this test
would be to respond to this error by waiting and trying again
for some limited number of times, rather than immediately
treating it as a fatal error.

However, while such a change might make the test appear to pass,
it would merely be hiding a very serious problem, namely,
chronic system resource exhaustion.

So, I suggest that, in this case, the failure serves the useful
purpose of revealing the system problem, which needs to be
cured apart from any changes to NSS.

I'll leave this bug open for a few more days, to give others
a chance to persuade me that some NSS change would and should
solve this problem.


------- Additional Comments From Jonathan Lennox 2000-11-02 13:13 -------

Okay, some more investigation leads me to agree with you.
What's happening is
that the TCP connections from the stress test stick around in TIME_WAIT for two
minutes; my kernel is only configured to support 1064 simultaneous open sockets,
which isn't enough for the 2K sockets opened by the stress test plus the 100 or
so normally in use on my system.

So I'd just suggest adding a note to the NSS test webpage to the effect of "The
SSL stress test opens 2,048 TCP connections in quick succession. Kernel data
structures may remain allocated for these connections for up to two minutes.
Some systems may not be configured to allow this many simultaneous connections
by default; if the stress tests fail, try increasing the number of simultaneous
sockets supported."

On FreeBSD, you can display the maximum number of simultaneous sockets with the
command
    sysctl kern.ipc.maxsockets
which on my system returns 1064.

It looks like this can be fixed with the kernel config option
    options NMBCLUSTERS=[something-large]
or by increasing the 'maxusers' parameter.

It looks like more recent FreeBSD implementations still have this limitation,
and the same solutions apply, plus you can alternatively specify the maxsockets
parameter in the boot loader.


---------------------------------

hpux HP-UX hp64 B.11.00 A 9000/800 2014971275 two-user license

We had to change the following kernel parameters to make our tests pass:

1. maxfiles. old value = 60. new value = 100.
2. nkthread. old value = 499. new value = 1328.
3. max_thread_proc. old value = 64. new value = 512.
4. maxusers. old value = 32. new value = 64.
5. maxuprc. old value = 75. new value = 512.
6. nproc.
   old formula = 20+8*MAXUSERS, which evaluated to 276.
   new value (note: not a formula) = 750.

A few other kernel parameters were also changed automatically
as a result of the above changes.
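
---------------------------------

As a rough cross-check of the numbers in the two reports above, the following
Python sketch redoes the arithmetic. The function names (nproc_from_maxusers,
socket_shortfall) are made up for illustration and are not part of NSS,
FreeBSD, or HP-UX. The inputs are only the figures quoted above, with "2K"
taken as 2,048 and "the 100 or so" taken as exactly 100, so treat the results
as estimates, not measurements.

```python
def nproc_from_maxusers(maxusers):
    """HP-UX nproc formula as quoted in the report: 20 + 8 * MAXUSERS."""
    return 20 + 8 * maxusers


def socket_shortfall(stress_sockets, baseline_sockets, max_sockets):
    """Return how many sockets are needed beyond the kernel limit.

    Assumption (worst case): every socket opened by the stress test is
    still held by the kernel, e.g. in TIME_WAIT, when the last one is
    opened. Returns 0 if the limit is sufficient.
    """
    needed = stress_sockets + baseline_sockets
    return max(0, needed - max_sockets)


if __name__ == "__main__":
    # HP-UX: the old maxusers value of 32 reproduces the old nproc value.
    print(nproc_from_maxusers(32))
    # FreeBSD: 2,048 test sockets plus about 100 in normal use,
    # against the reported kern.ipc.maxsockets value of 1064.
    print(socket_shortfall(2048, 100, 1064))
```

Under these assumptions the FreeBSD limit would need to rise by roughly a
thousand sockets for the stress test to fit. On the HP-UX side, the quoted
formula would give 532 for the new maxusers value of 64, so the explicit
setting of 750 is above what the formula alone would have produced.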