[stunnel-users] Problem with round-robin failover mode?

Matt Wise matt at nextdoor.com
Mon Jan 7 18:38:55 CET 2013


I've got dozens of clients connecting with Stunnel to a group of 5 servers. Each system has a config that looks like this:

> cert = /etc/stunnel/zookeeper.pem
> key = /etc/stunnel/zookeeper.key
> CAfile = /etc/stunnel/zookeeper_ca.pem
> verify = 2
> delay = yes
> sslVersion = TLSv1
> client = yes
> setuid = stunnel4
> setgid = stunnel4
> pid = /var/lib/stunnel4/zookeeper.stunnel4.pid
> socket = l:TCP_NODELAY=1
> socket = r:TCP_NODELAY=1
> TIMEOUTconnect = 2
> session = 86400
> debug = 5
> [zookeeper]
> accept  = 127.0.0.1:2182
> failover = rr
> connect = prod-zookeeper:2182
> connect = prod-zookeeper-1:2182
> connect = prod-zookeeper-2:2182
> connect = prod-zookeeper-3:2182
> connect = prod-zookeeper-4:2182
> connect = prod-zookeeper-5:2182


Essentially, the first host is a load balancer (an ELB), and the next 5 are the actual ZooKeeper hosts, so that we can bypass the ELB if it's giving us fits. What we're seeing is that almost every connection ends up on prod-zookeeper-5: our hosts pick the same system over and over again. We're running stunnel 4.52:

> Clients allowed=8000
> stunnel 4.52 on i486-pc-linux-gnu platform
> Compiled/running with OpenSSL 0.9.8k 25 Mar 2009
> Threading:PTHREAD SSL:ENGINE Auth:LIBWRAP Sockets:POLL,IPv6


Any ideas what might be wrong here? We want the connections to be *roughly* random across the list of hosts, and if one of the hosts goes down and the connection fails, we want the stunnel service to try again and randomly pick a new host. It doesn't seem to be doing that, though.
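
The expected behaviour described above can be sketched roughly in Python. This is only an illustration of "pick a random host, and on failure try another one", not stunnel's actual failover implementation. The function names pick_order and connect_with_failover are made up for this example, the host list is copied from the connect lines above, and the 2-second timeout mirrors TIMEOUTconnect = 2.

```python
import random
import socket

# Hosts copied from the "connect" lines in the config above.
BACKENDS = [
    ("prod-zookeeper", 2182),
    ("prod-zookeeper-1", 2182),
    ("prod-zookeeper-2", 2182),
    ("prod-zookeeper-3", 2182),
    ("prod-zookeeper-4", 2182),
    ("prod-zookeeper-5", 2182),
]


def pick_order(backends, rng=random):
    """Return the backends in a fresh random order for one client connection."""
    order = list(backends)
    rng.shuffle(order)
    return order


def connect_with_failover(backends, timeout=2.0,
                          connector=socket.create_connection, rng=random):
    """Try each backend once, in random order.

    Returns (backend, connection) for the first backend that accepts.
    Raises ConnectionError if every backend fails.
    """
    last_error = None
    for backend in pick_order(backends, rng):
        try:
            return backend, connector(backend, timeout)
        except OSError as exc:
            # Host down or timed out: remember the error, try the next one.
            last_error = exc
    raise ConnectionError("all backends failed") from last_error
```

With something like this, repeated calls should spread roughly evenly over the six hosts, and a dead host only costs one connect timeout before another host is tried. The connector argument exists only so that the logic can be exercised without real network access.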

--Matt



