[stunnel-users] stunnel-4.20 server loop on AIX 5.3

Peter Heimann heimannp at web.de
Mon May 28 20:51:39 CEST 2007


Under heavy load, stunnel-4.20 on AIX 5.3 enters a busy loop that drives
CPU usage above 99%.

Test setup:

        client ------------> gateway ------------> server
ApacheBench 2.0.40-dev     stunnel-4.20          Apache 1.3
                          in server mode
                          OpenSSL 0.9.8d

stunnel 4.20 on powerpc-ibm-aix5.3.0.0 with OpenSSL 0.9.8d 28 Sep 2006
Threading:PTHREAD SSL:ENGINE Sockets:POLL,IPv4

; stunnel configuration file, server mode
cert = /etc/stunnel.pem
setuid = stunnel
setgid = stunnel
pid = /tmp/stunnel.pid
debug = local3.notice
output = /tmp/stunnel.log
[https]
accept  = gateway:4433
connect = server:80


The first client requests are processed without error, then some connections
fail:

 % ab -n 1000 -c 70 https://gateway:4433/
 This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
 [...]
 Test aborted after 10 failures
 apr_socket_connect(): Invalid argument (22)
 Total of 705 requests completed

One or more connections stay in state CLOSE_WAIT, and stunnel is chewing
away CPU cycles (CPU usage >99%).

 % lsof -a -i -c stunnel
 COMMAND    PID    USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
 stunnel 196856 stunnel    6u  IPv4 0xf100020002feeb98      0t0  TCP gateway:4433 (LISTEN)
 stunnel 196856 stunnel   46u  IPv4 0xf100020002e98398      0t0  TCP gateway:4433->server:50271 (CLOSE_WAIT)

In this state, attaching a debugger shows that stunnel is stuck in a
loop in init_ssl() within client.c:

  /* client.c, init_ssl(), source lines 321-356 (excerpt) */
  while(1) {
      if(c->opt->option.client)
          i=SSL_connect(c->ssl);
      else
          i=SSL_accept(c->ssl);
      err=SSL_get_error(c->ssl, i);
      /* err==5 */
      /* ... */
      if(err==SSL_ERROR_SYSCALL) {
          switch(get_last_socket_error()) {
          case EINTR:
          case EAGAIN:
              /* loop continues */
              continue;
          }
      }

SSL_accept() fails, SSL_get_error() returns 5 (SSL_ERROR_SYSCALL), and
get_last_socket_error() returns EAGAIN, so the loop repeats indefinitely.
New connection requests are nevertheless still processed.
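
To illustrate why an immediate retry on EAGAIN consumes CPU, here is a
minimal sketch using plain POSIX sockets. It is not stunnel or OpenSSL
code, and it does not diagnose the AIX behaviour. The helpers
set_nonblocking(), spin_read() and wait_read() are hypothetical names
made up for this example. spin_read() retries at once, like the
`continue` in the loop quoted above. wait_read() sleeps in poll() until
the descriptor is readable or a timeout expires.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Busy-retry pattern: on EAGAIN or EINTR, try again immediately.
 * max_spins caps the loop so the demonstration terminates.
 * Returns the number of retries performed, or -1 on any other error.
 */
static long spin_read(int fd, char *buf, size_t len, long max_spins) {
    long spins = 0;
    assert(fd >= 0);
    while (spins < max_spins) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return spins;              /* data (or EOF) arrived */
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;                 /* real error */
        spins++;                       /* no waiting: this burns CPU */
    }
    return spins;
}

/*
 * Wait-then-retry pattern: on EAGAIN, block in poll() until the
 * descriptor is readable or timeout_ms elapses.
 * Returns the byte count from read(), or -1 with errno set
 * (ETIMEDOUT if the timeout expired).
 */
static ssize_t wait_read(int fd, char *buf, size_t len, int timeout_ms) {
    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;                  /* interrupted: retry the read */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;
        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc == 0) {
            errno = ETIMEDOUT;         /* nothing arrived in time */
            return -1;
        }
        if (rc < 0 && errno != EINTR)
            return -1;
    }
}
```

On an idle non-blocking socket, spin_read() uses every retry it is
allowed, while wait_read() sleeps in poll() and gives up with ETIMEDOUT
once the timeout expires. Whether the same mechanism explains the loop
seen on AIX is not established here.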

Even before CPU usage rises, SSL_accept() fails with SSL_ERROR_SYSCALL
between 20 and 1800 times on every new connection before any data is
exchanged.

stunnel enters this endless loop on AIX both with OpenSSL 0.9.8d and
OpenSSL 0.9.8e. If I replace the AIX gateway machine with one running
stunnel-4.20 on Solaris, I do not experience any problems.

What might trigger this loop on AIX 5.3? How can it be avoided?

--
Peter Heimann
