On Tue, Jun 10, 2025 at 10:21 PM Michał Trojnara via stunnel-users [email protected] wrote:
Hi Guys,
For the stunnel project, my priorities are:
- Security
- Reliability
- Scalability
- Performance
After approximately 27 years of active development, I believe stunnel performs well in all of these areas. However, I would never sacrifice security to improve reliability, never sacrifice reliability to improve scalability, and would be unlikely to sacrifice scalability to improve performance.
Security and reliability are hopefully self-explanatory. By scalability, I mean the ability to efficiently handle as many concurrent connections as possible on a given hardware platform—from resource-constrained IoT devices to high-end servers. This involves making effective use of available file descriptors and RAM, while minimizing thread synchronization overhead.
There are two primary sources of performance bottlenecks:
1. Connection rate – The number of new connections established per second. Establishing a new connection involves asymmetric cryptography, which is computationally expensive and therefore typically the main performance bottleneck.
2. Connection throughput – The data rate of a single connection, which—at least in user space—is handled by a single CPU thread. This is almost always faster than the network interface's throughput and thus rarely a limiting factor.
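As a rough illustration, the two can be benchmarked separately with OpenSSL's built-in speed tool (the exact algorithms depend on the negotiated ciphersuite, so these are only examples):

    # connection rate: dominated by asymmetric operations during the handshake
    openssl speed rsa2048
    # connection throughput: dominated by the symmetric cipher
    openssl speed -evp aes-256-gcm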
Hi,
Thank you for the quick response, and for stating the goals so clearly. I think my use-case fits within 'Performance bottlenecks on connection throughput on high-end servers':
The reason I started looking into stunnel performance is that it is using 96-100% CPU and is a bottleneck for the application I'm working on, even on relatively slow (by today's standards) 10Gbit/s networks. On faster networks with 25Gbit/s, 40Gbit/s or 100Gbit/s this limitation is even more serious. The bottleneck isn't in encryption/decryption, but in all the "overhead" around it (handling buffers, allocating/deallocating memory too often, making system calls too often, etc.), which was surprising: I would have expected encryption to take most of the time. In some sense this is actually good news, because all that "overhead" can be reduced, whereas reducing encryption/decryption time wouldn't really be possible (it already uses AES-NI CPU instructions).
I probably should've started my emails by explaining that.
This is probably more of a problem for servers and datacenters than for end users (who would rarely have an internet connection faster than 1Gbit/s), but I'm glad that high-end servers are also on your goals list.
The current OpenSSL performance settings were selected because they have been extensively tested by numerous users across many OpenSSL versions and network stacks. Any change would require a compelling reason.
The use case is live migrating VMs, where the connection between 2 hosts is encrypted using stunnel. Connection rate isn't very important for this use case, but connection throughput is. In some cases there might be a tradeoff (e.g. memory usage vs performance); it would be nice to have .conf flags to choose between the two. Although I think that when stunnel runs in *client* mode, throughput would probably be more important than using a little bit more memory.
Of course migrating more than one VM at a time (and thus using more than 1 HTTPS connection at a time) can work around this limitation up to a point, but it would be good to fix the performance bottlenecks that are easily fixable by tweaking OpenSSL settings and buffer sizes.
So far my tweaks achieve a 17% performance improvement (I don't know whether that meets the threshold for compelling), but it should be possible to gain even more.
I'll know more once I've finished the rest of my patches, but comparing nginx vs stunnel as a server (with curl as a client) shows that nginx can do 20Gbit/s on a single stream, while a patched stunnel can do ~18Gbit/s, so there is potentially more performance to be gained by improving stunnel (and with a now-patched curl client I should be able to achieve more on both).
(I'd prefer to improve what we already have, i.e. stunnel. I only used nginx as a performance reference because I wanted to see the fastest speed I can achieve using OpenSSL on my hardware, and nginx is currently the fastest among nginx, hitch, socat and stunnel. I don't see a compelling reason to switch away from stunnel, though.)
There are also other approaches my application could have taken, e.g. using WireGuard, but they seem considerably more risky (WireGuard runs in the kernel, so a bug there can crash the whole host). I agree with your priorities that reliability is more important than performance. That said, from a testing point of view I can try adding it to my comparison, at least to have a "max achievable performance on this HW" reference point to strive towards.
Best regards, --Edwin
Best regards, Mike
hshh wrote:
I tested the patch; it causes issues for some applications after connecting via stunnel. For example, when connecting to SSH via stunnel, the SSH client's display has problems.
Sounds like issues similar to what I've been seeing in curl (i.e. this change may expose latent bugs in OpenSSL, the application, or stunnel). I think the safest route would be to have this disabled by default, with a stunnel.conf flag that applications can enable if they know it works for their protocol/implementation.
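Something along these lines, with a purely hypothetical option name, so that services known to be compatible can opt in per service:

    [ssh-tunnel]
    client = yes
    accept = 127.0.0.1:2222
    connect = server.example.com:443
    ; hypothetical opt-in flag, off by default
    readahead = yes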
Thank you, --Edwin
On Tue, Jun 10, 2025 at 5:46 PM Edwin Torok via stunnel-users [email protected] wrote:
Hello,
Flamegraph profiling on stunnel has shown that most time is NOT spent in encryption/decryption, but in sending/receiving data.
Experimental setup:
CPU: 2x 18-core Intel Xeon Gold 6354
NIC: Intel E810-XXV, 100 Gbit/s
Kernel: 6.14.8-300.fc42.x86_64 x86_64
Mem: 251.32 GiB
OS: Fedora Linux 42
openssl version: OpenSSL 3.2.4 11 Feb 2025 (Library: OpenSSL 3.2.4 11 Feb 2025)
stunnel: 5.75 from git
Network, encryption and memory copy speeds converted into Gbit/s on the above HW:
iperf3: 40Gbit/s
openssl speed -evp aes-256-gcm: 72Gbit/s
perf bench memcpy: 100Gbit/s
Under ideal conditions we should be able to achieve at most ~20.5Gbit/s, however I only measured ~7.6Gbit/s.
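The ~20.5Gbit/s comes from treating the data path as serial (every byte crosses the NIC, gets encrypted or decrypted, and is copied at least once), so the best case is the combination of the three rates above:

    1 / (1/40 + 1/72 + 1/100) ≈ 20.5Gbit/s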
The scripts that I used to perform measurements are included in the attached patch 3.
See also related discussion on curl: https://github.com/curl/curl/pull/17548
Enabling readahead, avoiding excessive alloc/free of buffers, increasing buffer sizes, and improving buffer handling can all improve performance.
For now I'm sending just the first small patches as proofs of concept, each of them a one-liner. I hope you can include these in some form in the next release.
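In terms of the OpenSSL API, the two one-liners boil down to roughly the following (a sketch of the intent, not the literal diff against src/ctx.c):

    #include <openssl/ssl.h>

    /* sketch: applied to the service's SSL_CTX during initialization */
    static void tune_ctx(SSL_CTX *ctx) {
        /* let OpenSSL read as much ciphertext per syscall as its buffer allows */
        SSL_CTX_set_read_ahead(ctx, 1);
        /* keep per-connection buffers allocated instead of freeing them
           whenever they become empty */
        SSL_CTX_clear_mode(ctx, SSL_MODE_RELEASE_BUFFERS);
    }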
Eventually these should probably be exposed as configuration flags in stunnel.conf; I can help implement that, or leave it to you if you prefer.
In fact readahead was enabled in previous versions of stunnel, and then disabled again, so I assume you ran into some bugs with it on certain protocols?
Increasing the buffer size is not a clear win (as with curl), due to the excessive use of memmove/memset, and because SSL_read/SSL_write still only process one TLS record at a time. I have some further changes that can improve this, but the patches are larger and I haven't finished testing them for correctness yet. Please let me know if you'd want these as patches, or if you'd rather implement them yourself.
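To give an idea of the direction (a sketch of the general technique only, not what my pending patches literally do): with readahead enabled, OpenSSL can already hold further complete records in its user-space buffer, so the transfer loop has to keep reading while SSL_has_pending() reports buffered data instead of going straight back to poll():

    /* sketch; forward_to_peer() is a hypothetical stand-in for stunnel's
       transfer loop writing the plaintext onward */
    char buf[65536];
    int n;
    do {
        n = SSL_read(ssl, buf, sizeof buf);
        if (n > 0)
            forward_to_peer(buf, n);
    } while (n > 0 && SSL_has_pending(ssl));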
Edwin Török (3):
  openssl: enable readahead
  openssl: disable SSL_MODE_RELEASE_BUFFERS
  Benchmark scripts
 src/ctx.c                    |   5 +-
 tests/benchmark/benchmark.sh | 120 +++++++++++++++++++++++
 tests/benchmark/launch.sh    | 185 +++++++++++++++++++++++++++++++++++
 3 files changed, 309 insertions(+), 1 deletion(-)
 create mode 100644 tests/benchmark/benchmark.sh
 create mode 100755 tests/benchmark/launch.sh
--
2.43.5
_______________________________________________
stunnel-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]