[stunnel-users] Stunnel hangs on big flows of data

Dario Mariani dario.mariani at sun-cs-italy.com
Fri Oct 27 12:41:23 CEST 2006


On 27 Oct 2006, at 09:31, Michal Trojnara wrote:

> On Thursday 26 October 2006 17:54, Dario Mariani wrote:
>> I'm deploying stunnel on some servers.
>> I did some tests, and i never had problems.
>> For example, i tried 5k parallel connections,
>> and i didn't have any problem.
>
> Thank you for the information.
> What is your platform (hardware, operating system)?

Well, it was only a test that I did to help a user understand the
concept of ulimit :)
With the tunneling on (and a "ulimit -n 8192" in the stunnel.init
script :) ), we ran 5k "telnet localhost 10001 &" or something
similar; it wasn't a big stress test...
The system worked like a charm: all connections went fine (until the
Oracle listener closed the connection, of course), without problems.
I don't remember the exact system, but it was Solaris 9 on big iron (I
think some Sun Fire 6800/15K/25K with 52 or 56 1.2-1.33 GHz CPUs and more
than 200 GB of RAM).
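
As an aside, the limit that "ulimit -n" sets can also be checked from inside a process. A minimal Python sketch, assuming a Unix-like system; the helper name fits_in_limit, the fds_per_connection parameter and the reserved headroom are hypothetical and only for illustration, and they say nothing about how many descriptors stunnel itself needs per connection:

```python
import resource

def fits_in_limit(limit, connections, fds_per_connection=1, reserved=64):
    """Return True if the expected descriptor count fits under `limit`.

    `reserved` is an assumed headroom for log files, listeners, etc.
    """
    if limit == resource.RLIM_INFINITY:
        return True
    return connections * fds_per_connection + reserved <= limit

# RLIMIT_NOFILE is the per-process open file limit that "ulimit -n" reports.
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft:", soft_limit, "hard:", hard_limit)
print("5000 connections fit:", fits_in_limit(soft_limit, 5000))
```

Running this from the same shell (or init script) that starts the daemon shows whether the raised limit is actually in effect for that environment.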

In a few days (around the end of _next_ week, sorry) I can give
you the results of some performance tests we did... but they are not
very detailed :)

>
>> 2006.10.20 16:00:58 LOG7[20302:75]: SSL_read returned WANT_READ:
>> retrying
> [cut]
>> It did complete correctly within a pair of minutes on an ibook 64
>> 1.33 1g ram, but with LOADS of want_read and want_write errors on
>> both sides of stunnel.
>
> They're not errors!  They're debug (LOG7) messages.
> The message does not indicate anything wrong by itself.
>
> Debugging should be only enabled when you're trying
> to diagnose a problem - not in a production system.
>
> What is the problem (besides those debug messages)?

The problem is this:
the system works well for about 45 minutes, then gives these messages and
hangs.
Simple and useless :(
The traffic "shape" is that of a data warehouse: a small number
of connections (a few tens, I think) carrying a heavy load of traffic
from the DB (stunnel server) to the app server (stunnel client), with
peaks every 15 minutes. And I _think_ that sometimes there are big uploads
(SQL updates) from the client to the server.
This is what I understood from asking around :)

Now I'm a little confused... the server started giving these debug
messages, and then HUNG HORRIBLY within minutes. :)
In the tests I ran on my laptop I got the same debug messages,
but everything worked well and in the expected time (the path netcat 120 MB
file -> stunnel client -> stunnel server -> openssl s_server >/dev/null
took 20 seconds!!!)
So at this point I think the problem isn't the WANT_READ debug
messages themselves, but something that may (or may not) be related to them.

What I'm asking is:
- what do these messages mean, _exactly_? Reading some OpenSSL-related
forums, I saw that this message is issued by the server when the read
buffer is empty and the server is waiting for data.
- do you have any idea where I should direct my analysis?
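
For context on the first question: in OpenSSL, WANT_READ (SSL_ERROR_WANT_READ) means that a non-blocking operation could not complete because more data must first arrive from the peer, and that the call should be retried once it has. The sketch below reproduces that condition with Python's ssl module and in-memory BIOs rather than with stunnel; the helper name try_handshake_step and the hostname "example.invalid" are placeholders chosen for illustration.

```python
import ssl

def try_handshake_step(tls):
    """Attempt the handshake once; report what the TLS layer is waiting for."""
    try:
        tls.do_handshake()
        return "done"
    except ssl.SSLWantReadError:
        # Not a failure: more bytes from the peer are needed, retry later.
        return "want_read"
    except ssl.SSLWantWriteError:
        # Not a failure: pending output must be flushed first, retry later.
        return "want_write"

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
incoming = ssl.MemoryBIO()   # bytes received from the peer (none yet)
outgoing = ssl.MemoryBIO()   # bytes the TLS layer wants to send
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="example.invalid")

state = try_handshake_step(tls)
client_hello_len = outgoing.pending  # the ClientHello waiting to be sent
print(state, client_hello_len)
```

Here no server ever answers, so the client keeps wanting to read. A steady stream of such retries on an active connection is normal; the condition only matters if the awaited data never arrives.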


> How can I reproduce the hang mentioned in the subject?

Well, I have a problem with this point:
I CANNOT bring stunnel up on the system that had the problem until I
fix the problem :(

Sorry for my lack of precision and detail, but these are chaotic
days here :)

Bye, dario.


