[stunnel-users] Premature socket closure - race condition bug?

Graham Nayler (work) graham.nayler at hallmarq.net
Tue Sep 23 22:54:54 CEST 2014


So it's been running for several days now and operationally it's fine for 
me.

I've looked a bit harder at the possible problems I outlined below, and this 
implementation does not exhibit them... but I think that's somewhat by 
accident.

Addressing the second problem first: what I was worrying about turns out to 
stem from my lack of understanding of how the poll works. If the HUP occurs 
after Stunnel has unloaded the socket, the poll does still complete with 
(POLLIN|POLLHUP), but the read length is 0. The zero-length read is then 
taken as the signal to shut the socket down. (In my case this is also how 
the programs on either side of the Stunnel link detect socket shutdowns.) 
So that's fine.
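
For reference, the behaviour boils down to something like this minimal 
sketch (my own code, not the stunnel source; read_until_eof() is just an 
illustrative name):

    #include <poll.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Illustrative only: fd is a connected TCP socket. poll() may
     * report POLLIN|POLLHUP together; the authoritative EOF signal
     * is the zero-length read, not the HUP flag itself. */
    static void read_until_eof(int fd)
    {
        char buf[4096];
        for (;;) {
            struct pollfd pfd = {.fd = fd, .events = POLLIN};
            if (poll(&pfd, 1, -1) < 0)
                return;                     /* poll error */
            printf("revents: POLLIN=%d POLLHUP=%d\n",
                !!(pfd.revents & POLLIN), !!(pfd.revents & POLLHUP));
            ssize_t n = read(fd, buf, sizeof buf);
            if (n < 0)
                return;                     /* read error */
            if (n == 0) {                   /* zero-length read: peer closed */
                printf("EOF - shutting the socket down\n");
                return;
            }
            /* n > 0: data can still arrive even after POLLHUP is set */
        }
    }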

I'm not happy about the way the read-HUP is handled in large transfers, 
though. The attached logs (including printouts of the relevant 'revents' 
fields after each poll completion, and byte transfer counts) show a number 
of scenarios:
1) Large transfer (> SSL payload size), HUP received after unloading socket
2) Large transfer (> SSL payload size), HUP received before unloading socket
3) Small transfer, HUP received along with socket read
4) Small transfer, HUP received after socket read and after write to SSL
5) Small transfer, HUP received after socket read and before write to SSL

Having read the code more closely, I see that you're using a separate 
function, s_poll_rdhup(), for the read hangup, masking the poll return 
value with POLLRDHUP. Tracing through the return values on my system, this 
turns out never to return true, as the completion signal is only ever 
POLLHUP, never POLLRDHUP. And it's only because of this that scenario 2 
works: had it detected a POLLRDHUP condition, it would have stopped reading 
the socket after its first read loop, before reading all the data, and sent 
an incomplete payload. As the socket file descriptors are the same for both 
read and write, it also acts on the HUP for the WR socket and tries to hang 
that up each time around the loop (this applies to both scenarios 2 and 3) 
- although I see that all it's actually doing is clearing a flag that 
allows the loop to terminate, so the only real effect is that the debug/log 
message appears more often than necessary.
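
On Linux, the kind of check I'd expect to be needed looks roughly like this 
(my own sketch, not the stunnel source; peer_read_hup() is a name I made 
up; note that POLLRDHUP needs _GNU_SOURCE and, unlike POLLHUP, is only 
reported if it was requested in 'events'):

    #define _GNU_SOURCE                     /* for POLLRDHUP on Linux */
    #include <poll.h>

    /* Illustrative replacement for a POLLRDHUP-only test. On my
     * system a peer close shows up only as POLLHUP, so masking with
     * POLLRDHUP alone never fires; accepting either flag (and acting
     * only once pending data has been drained) seems safer. */
    static int peer_read_hup(short revents)
    {
    #ifdef POLLRDHUP
        return revents & (POLLRDHUP | POLLHUP);
    #else
        return revents & POLLHUP;           /* POLLRDHUP unavailable */
    #endif
    }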

So for me it's working, but I have my doubts about how general that 
solution is... though again, that may be my lack of understanding of the 
circumstances under which POLLRDHUP is seen.

Graham

----- Original Message ----- 
From: "Graham Nayler (work)" <graham.nayler at hallmarq.net>
To: <stunnel-users at stunnel.org>
Sent: Monday, September 22, 2014 1:19 PM
Subject: Re: [stunnel-users] Premature socket closure - race condition bug?


> Ah... another possible problem with your solution - or alternatively with 
> my understanding of the socket interface. What happens if the remote 
> program submits the data, stunnel unloads the socket, and only then does 
> the remote program close the socket? Will the s_poll_wait be triggered on 
> the HUP alone (I guess yes)? If not, then stunnel will hang until 
> something submits more data. But if it is triggered on the HUP alone, the 
> lines checking for internal failures will see an exit with no read or 
> write data waiting, and take the longjmp() after "s_poll_wait returned %d, 
> but no descriptor is ready". My suggested version did an additional check 
> of sock_rd_hup to cope with just this scenario.
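> Roughly, the case I'm worried about is this (my own sketch, not a patch; 
> wakeup_is_internal_error() is just an illustrative name):
>
>     #include <poll.h>
>
>     /* Illustrative only: classify a single s_poll_wait() wakeup.
>      * A HUP-only wakeup (no POLLIN/POLLOUT set) is an orderly peer
>      * close, not the "no descriptor is ready" internal-error case. */
>     static int wakeup_is_internal_error(short revents)
>     {
>         if (revents & (POLLIN | POLLOUT))
>             return 0;               /* data to move: normal work */
>         if (revents & POLLHUP)
>             return 0;               /* peer closed: orderly shutdown */
>         return 1;                   /* nothing ready: take the longjmp() */
>     }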
>
> -----Original Message----- 
> From: Graham Nayler (work)
> Sent: Monday, September 22, 2014 12:54 PM
> To: stunnel-users at stunnel.org
> Subject: Re: [stunnel-users] Premature socket closure - race condition 
> bug?
>
> Since submitting my original patch, I've realised there is a possible
> scenario that both my solution and, I think, yours as well still fail on:
> we're assuming that once the readsocket call has returned, there's nothing
> left in the socket buffer. Is this not dependent on the size of the data
> submitted by the remote program, the capacity of the socket buffer itself,
> and the space remaining in c->sock_buf after c->sock_ptr? I.e. there could
> still be data in the socket if the buffered data were longer than what
> readsocket returned. Should stunnel not re-poll/check the socket to ensure
> that POLLIN is not set before taking any action on POLLHUP? Maybe
> s_poll_hup() should only be allowed to return true if POLLIN is not also
> set?
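>
> Something like this is what I have in mind (my own sketch, not a patch;
> the name and buffer handling are illustrative):
>
>     #include <poll.h>
>     #include <unistd.h>
>
>     /* Keep reading until the kernel's socket buffer is empty (or the
>      * destination buffer is full) before honouring a POLLHUP, so
>      * buffered data longer than one read is not thrown away. */
>     static ssize_t drain_before_hup(int fd, char *buf, size_t space)
>     {
>         ssize_t total = 0;
>         while ((size_t)total < space) {
>             struct pollfd pfd = {.fd = fd, .events = POLLIN};
>             if (poll(&pfd, 1, 0) <= 0)   /* nothing buffered (or error) */
>                 break;
>             if (!(pfd.revents & POLLIN)) /* HUP with no data left */
>                 break;
>             ssize_t n = read(fd, buf + total, space - total);
>             if (n <= 0)                  /* 0 = EOF, <0 = error */
>                 break;
>             total += n;
>         }
>         return total;                    /* only now act on the HUP */
>     }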
>
> Graham
>
> -----Original Message----- 
> From: Graham Nayler (work)
> Sent: Monday, September 22, 2014 12:13 PM
> To: stunnel-users at stunnel.org
> Subject: Re: [stunnel-users] Premature socket closure - race condition 
> bug?
>
> Compiled and installed on the Linux Mint server, and it's looking fine so
> far with my test example. If nothing shows up in the next few hours of
> normal running, it should be fine.
>
> It's of lower priority, but I'd like to update the client machines as well
> when the release goes official. This may be better in a separate thread,
> or already answered in an FAQ I've not found, but as I have about 50
> (eventually >80) sites to update (mostly Windows 7, some XP SP3), is there
> a way for the Windows installer to be run unattended? Failing that, is
> there a list, or a way to generate a list, of the updated binaries plus
> registry/config entries, so I can make myself an unattended
> install/upgrade? Most sites are currently on 5.01, but some are still on
> 4.56.
>
> -----Original Message----- 
> From: Michal Trojnara
> Sent: Monday, September 22, 2014 11:00 AM
> To: stunnel-users at stunnel.org
> Subject: Re: [stunnel-users] Premature socket closure - race condition 
> bug?
>
>
> Graham Nayler wrote:
>> In the failure scenario though, the returned status was
>> (POLLIN|POLLHUP).
>
> Great work.  I indeed forgot that the POLLHUP condition may be set
> while some buffered data is still available.
>
> Could you try:
> https://www.stunnel.org/downloads/beta/stunnel-5.05b1.tar.gz
> ?
>
> Mike
-------------- next part --------------
A non-text attachment was scrubbed...
Name: stunnel.log
Type: application/octet-stream
Size: 20965 bytes
Desc: not available
URL: <http://www.stunnel.org/pipermail/stunnel-users/attachments/20140923/1c327fdc/attachment.obj>

