[stunnel-users] (stunnel-4.14) poor performance with openssl engine

Zhuang Yuyao mlistz at gmail.com
Mon Apr 3 06:12:17 CEST 2006


I've just added openssl engine support for the Cavium Nitrox lite PCI card. 
Here is the speed test output:

[engine] stands for speed test output with engine
[soft] stands for speed test output without engine (openssl internal 
software implementation used)

~ # openssl speed -evp [rc4|aes256|sha1] [-engine /root/libcavium.so]
The 'numbers' are in 1000s of bytes per second processed.
type                     16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
rc4[engine]              6339.20k    25376.00k   100966.40k  3757056.00k   919961.60k
rc4[soft]                6056.66k     7017.34k     7228.37k     7339.13k     7356.14k
aes-256-cbc[engine]      3178.40k    25465.60k    50636.80k   194201.60k  2088140.80k
aes-256-cbc[soft]        2430.36k     2572.93k     2597.45k     2617.27k     2609.82k
sha1[engine]              774.20k     3115.20k    16486.40k    27033.60k   285593.60k
sha1[soft]                915.74k     2622.71k     5678.93k     7942.23k     9041.75k

~ # openssl speed rsa [-engine /root/libcavium.so]
                  sign    verify    sign/s verify/s
rsa[soft]  512 bits 0.013810s 0.001349s     72.4    741.2
rsa[soft] 1024 bits 0.079200s 0.004342s     12.6    230.3

                  sign    verify    sign/s verify/s
rsa[engine]  512 bits 0.000037s 0.000022s  26725.8  45181.5
rsa[engine] 1024 bits 0.000046s 0.000043s  21866.7  23489.1
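
To put a number on the gap between the two modes, the engine-to-software ratios can be computed directly from the figures above. This is only a small sketch: the dictionary keys and the `speedups` helper are names made up for illustration, and the cipher values are taken from the 8192-byte column.

```python
# Figures copied from the tables above: cipher values are in 1000s of bytes
# per second at the 8192-byte block size, RSA values are signatures per second.
# The key names are illustrative labels, not openssl output.
soft = {
    "rc4 8192B": 7356.14,
    "aes-256-cbc 8192B": 2609.82,
    "sha1 8192B": 9041.75,
    "rsa 512 sign/s": 72.4,
    "rsa 1024 sign/s": 12.6,
}
engine = {
    "rc4 8192B": 919961.60,
    "aes-256-cbc 8192B": 2088140.80,
    "sha1 8192B": 285593.60,
    "rsa 512 sign/s": 26725.8,
    "rsa 1024 sign/s": 21866.7,
}


def speedups(soft, engine):
    """Return the engine/soft ratio for every benchmark present in both tables."""
    return {name: engine[name] / soft[name] for name in soft if name in engine}


ratios = speedups(soft, engine)
for name, ratio in ratios.items():
    print(f"{name:20s} {ratio:8.1f}x")
```

All five ratios come out well above 1, which is the baseline the stunnel measurements below are compared against.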

The speed test shows that using the engine is much faster than the software 
implementation. But when testing with stunnel, the result is really poor.
I am still using apache benchmark 2 (ab2) as the test tool.
#ab2 -c 50 -n 100 (a html file whose size is 1Kbytes)
#ab2 -c 50 -n 100 (a html file whose size is 4Kbytes)
#ab2 -c 50 -n 100 (a html file whose size is 50Kbytes)
#ab2 -c 50 -n 100 (a html file whose size is 100Kbytes)
#ab2 -c 50 -n 100 (a html file whose size is 200Kbytes)

Here is the result (requests per second):
Algorithm           engine       1K       4K      50K     100K     200K
AES-256-CBC+SHA1    no        14.12    13.21     7.92     4.48     2.17
AES-256-CBC+SHA1    yes       51.10    23.34     7.30     3.12     1.91

The data says it all. When using the engine with stunnel, we get some 
performance increase for small files, but processing bigger files is even 
slower than with the software implementation. This is quite strange, because 
processing large data blocks with a hardware engine is expected to be much 
faster than processing small data blocks. I ran the same test several 
times, and the result was the same each time.
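
As a rough cross-check of the numbers in the result table, the requests-per-second figures can be turned into an approximate payload throughput. The sketch below assumes throughput is simply requests/s times file size, ignoring HTTP headers, TLS record overhead and handshakes; the constant and function names are illustrative only.

```python
# Requests per second copied from the result table above, in the same order
# as the file sizes (in Kbytes).
SIZES_KB = [1, 4, 50, 100, 200]
REQS_SOFT = [14.12, 13.21, 7.92, 4.48, 2.17]
REQS_ENGINE = [51.10, 23.34, 7.30, 3.12, 1.91]


def payload_kb_per_s(reqs_per_s, sizes_kb):
    """Approximate payload throughput as requests/s times file size.

    Assumption: protocol overhead (HTTP headers, TLS records, handshakes)
    is ignored, so these are lower-bound estimates of bytes on the wire.
    """
    return [r * s for r, s in zip(reqs_per_s, sizes_kb)]


soft_tp = payload_kb_per_s(REQS_SOFT, SIZES_KB)
engine_tp = payload_kb_per_s(REQS_ENGINE, SIZES_KB)

for size, s, e in zip(SIZES_KB, soft_tp, engine_tp):
    print(f"{size:4d}K  soft {s:7.1f} KB/s  engine {e:7.1f} KB/s  "
          f"engine/soft {e / s:4.2f}")
```

Under that assumption, both configurations level off at a few hundred Kbytes per second for the larger files, which is below even the 2609.82k that `openssl speed` reports for software aes-256-cbc at 8192-byte blocks.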

Some things I am sure of:
1) During the stunnel engine test, all of the algorithms are processed 
by the engine. I can see this from the debugging information printed on 
the console.
2) The client running ab2 is an AMD64 with 1G RAM, and the web server 
(apache) used in the test above is a XEON with 2G RAM, so the 
bottleneck should be neither ab2 nor apache.

So here are my questions: Has anyone tried stunnel with a hardware 
accelerator? How was the performance?

BTW, attached to this mail is a little patch for stunnel-4.14 which 
allows the engine directive in stunnel.conf to be set to either an engine 
id or the path of a shared object that implements the engine (ie. engine = 

Best regards,

    Zhuang Yuyao
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: engine-load.patch
URL: <http://www.stunnel.org/pipermail/stunnel-users/attachments/20060403/74955a47/attachment.ksh>
