Hello,

I am evaluating stunnel to see whether it is a viable way to provide encryption on a system built around an Atmel processor that includes a HW-accelerated encryption block.  I am still ramping up on stunnel and wanted to capture what I have done so far.  My questions come toward the end of this email.

 

My research indicates that stunnel is built on openssl.  I have been able to use openssl on its own to reach the cryptodev HW encryption engine through the Linux kernel module at /lib/modules/4.14.79/extra/cryptodev.ko.  When openssl runs without the cryptodev engine (cryptodev module not loaded), I get the pure SW encryption implementation that openssl provides by default.  When I run benchmark speed tests with openssl using SW encryption, I see the following results:

 

# time -v openssl speed -evp aes-128-cbc

Doing aes-128-cbc for 3s on 16 size blocks: 1689887 aes-128-cbc's in 2.95s

Doing aes-128-cbc for 3s on 64 size blocks: 568389 aes-128-cbc's in 2.95s

Doing aes-128-cbc for 3s on 256 size blocks: 151550 aes-128-cbc's in 2.96s

Doing aes-128-cbc for 3s on 1024 size blocks: 38599 aes-128-cbc's in 2.96s

Doing aes-128-cbc for 3s on 8192 size blocks: 4845 aes-128-cbc's in 2.95s

OpenSSL 1.0.2p-fips  14 Aug 2018

built on: reproducible build, date unspecified

options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)

compiler: arm-laird-linux-gnueabi-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64  -O3  -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -I/home/sii/wb50n_space2_legacy_6.0.0.x/wb/buildroot/output/wb50n_space2_legacy/host/arm-buildroot-linux-gnueabi/sysroot/usr/local/ssl/fips-2.0/include -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM

The 'numbers' are in 1000s of bytes per second processed.

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes

aes-128-cbc       9165.49k    12331.15k    13107.03k    13353.17k    13454.32k

        Command being timed: "openssl speed -evp aes-128-cbc"

        User time (seconds): 14.81

        System time (seconds): 0.10

        Percent of CPU this job got: 99%

        Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.06s

        Average shared text size (kbytes): 0

        Average unshared data size (kbytes): 0

        Average stack size (kbytes): 0

        Average total size (kbytes): 0

        Maximum resident set size (kbytes): 13376

        Average resident set size (kbytes): 0

        Major (requiring I/O) page faults: 0

        Minor (reclaiming a frame) page faults: 145

        Voluntary context switches: 0

        Involuntary context switches: 721

        Swaps: 0

        File system inputs: 0

        File system outputs: 0

        Socket messages sent: 0

        Socket messages received: 0

        Signals delivered: 0

        Page size (bytes): 4096

        Exit status: 0

#

 

When I load the cryptodev module and take advantage of the accelerated hardware encryption, the benchmark reports significantly higher throughput for blocks of 256 bytes and larger.  Here is what those results look like:

 

# modprobe cryptodev

# time -v openssl speed -evp aes-128-cbc

Doing aes-128-cbc for 3s on 16 size blocks: 44163 aes-128-cbc's in 0.12s

Doing aes-128-cbc for 3s on 64 size blocks: 31345 aes-128-cbc's in 0.15s

Doing aes-128-cbc for 3s on 256 size blocks: 18923 aes-128-cbc's in 0.11s

Doing aes-128-cbc for 3s on 1024 size blocks: 13847 aes-128-cbc's in 0.13s

Doing aes-128-cbc for 3s on 8192 size blocks: 8427 aes-128-cbc's in 0.06s

OpenSSL 1.0.2p-fips  14 Aug 2018

built on: reproducible build, date unspecified

options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)

compiler: arm-laird-linux-gnueabi-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64  -O3  -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -I/home/sii/wb50n_space2_legacy_6.0.0.x/wb/buildroot/output/wb50n_space2_legacy/host/arm-buildroot-linux-gnueabi/sysroot/usr/local/ssl/fips-2.0/include -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM

The 'numbers' are in 1000s of bytes per second processed.

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes

aes-128-cbc       5888.40k    13373.87k    44038.98k   109071.75k  1150566.40k

        Command being timed: "openssl speed -evp aes-128-cbc"

        User time (seconds): 0.59

        System time (seconds): 8.72

        Percent of CPU this job got: 61%

        Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.11s

        Average shared text size (kbytes): 0

        Average unshared data size (kbytes): 0

        Average stack size (kbytes): 0

        Average total size (kbytes): 0

        Maximum resident set size (kbytes): 13792

        Average resident set size (kbytes): 0

        Major (requiring I/O) page faults: 0

        Minor (reclaiming a frame) page faults: 144

        Voluntary context switches: 41154

        Involuntary context switches: 3321

        Swaps: 0

        File system inputs: 0

        File system outputs: 0

        Socket messages sent: 0

        Socket messages received: 0

        Signals delivered: 0

        Page size (bytes): 4096

        Exit status: 0

#

 

As can be seen in the results, the time reported on each "Doing aes-128-cbc" line dropped from around 2.95 s to around 0.10 s.  Also of interest, the context switches are significantly higher when running hardware encryption (voluntary: 0 versus 41154; involuntary: 721 versus 3321), because of interrupts and the overhead of using the hardware engine.  I can also look at /proc/interrupts and see significant increases in the atmel-aes interrupt counts when using the cryptodev HW acceleration encryption engine.  This gives a good indication that the cryptodev module is in use and is doing the encryption.
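
As a cross-check on how to read those tables, the figures can be recomputed by hand. By default `openssl speed` divides the bytes processed by the time printed on each "Doing ..." line, and that time is CPU user time unless `-elapsed` is given, so time spent waiting on the hardware is not counted. The sketch below redoes the arithmetic for the 8192-byte row; the function name `throughput_k` is made up for illustration and is not part of openssl.

```shell
# throughput_k COUNT BLOCK_BYTES SECONDS
# Prints throughput in units of 1000 bytes/s, the unit openssl speed reports.
throughput_k() {
    awk -v n="$1" -v b="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", n * b / s / 1000 }'
}

# cryptodev run, 8192-byte blocks: 8427 blocks, 0.06 s reported
throughput_k 8427 8192 0.06
# same block count divided by the full 3 s measurement window instead
throughput_k 8427 8192 3
# software run, 8192-byte blocks: 4845 blocks, 2.95 s reported
throughput_k 4845 8192 2.95
```

The first and third calls reproduce the 1150566.40k and 13454.32k entries in the tables. The second gives about 23011.33k, which is probably closer to what the hardware path delivers in wall-clock terms. Running `openssl speed -elapsed -evp aes-128-cbc` in both configurations should give directly comparable wall-clock numbers.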

 

I would like to figure out how to let stunnel take advantage of the cryptodev HW acceleration encryption engine available in openssl.  I have made some attempts, but so far I have not been able to determine whether stunnel is actually using the cryptodev engine.  Here is what I have done with stunnel.  I already have a client and server communicating successfully through stunnel.  To verify this, I used the “nc” utility to send characters back and forth between two different machines.  The stunnel.conf file on the server is the stock, out-of-the-box file.  I’m interested in encrypting on the client side.  Here is my current client.conf file, in /etc/stunnel:

 

# cat client.conf

debug = 7

output = /tmp/stunnel-server.log

pid = /tmp/stunnel.pid

 

engine = cryptodev

 

[test]

verify = 1

client = yes

accept = 127.0.0.1:2000

connect = 192.168.0.220:30000

CAfile = /etc/stunnel/certificate.crt

engineNum = 1

 

#

 

I am attempting to set up cryptodev as the configured engine for the client.  I am able to start stunnel, using client.conf, as follows:

 

# stunnel /etc/stunnel/client.conf

#

 

If I do a “ps” command to display processes, I can see that stunnel is running in the background.  At this point, I can use “nc” to send data, as follows:

 

# nc 127.0.0.1 2000 < /tmp/long_file.txt

 

I am able to see the text from long_file.txt on the server, which is also running nc.  The problem is that I don’t see interrupts increasing in /proc/interrupts, which leaves me wondering whether I have configured stunnel correctly to use the cryptodev engine.  If I try to remove the cryptodev module at this point, while stunnel is running, I receive a message that it is in use, as follows:

 

# modprobe -r cryptodev

modprobe: FATAL: Module cryptodev is in use.

#
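
One way to tie the "in use" message to a specific process is to look for an open file descriptor on /dev/crypto, which is the device node the cryptodev module normally exposes. This is only a sketch: the function name `holds_dev_crypto` is hypothetical, and it assumes the OpenSSL engine keeps a descriptor on /dev/crypto open for the life of the process.

```shell
# holds_dev_crypto PID [PROC_ROOT]
# Exit status 0 if any open file descriptor of PID resolves to /dev/crypto.
# PROC_ROOT defaults to /proc; it is a parameter only so the function can be
# tried against a fake directory tree.
holds_dev_crypto() {
    pid=$1
    root=${2:-/proc}
    for fd in "$root/$pid"/fd/*; do
        # readlink prints the target of the fd symlink; errors are ignored
        [ "$(readlink "$fd" 2>/dev/null)" = "/dev/crypto" ] && return 0
    done
    return 1
}
```

With the pid file from client.conf, a check could look like `holds_dev_crypto "$(cat /tmp/stunnel.pid)" && echo "stunnel has /dev/crypto open"`. The "Used by" count that `lsmod` prints for cryptodev gives the same information at module level.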

 

If I kill the stunnel process, I am able to successfully remove the cryptodev module, which seems to suggest stunnel is the process using the cryptodev module.  Also, once I have removed the cryptodev module, I can’t restart stunnel.  Instead, I get the following errors back:

 

# stunnel /etc/stunnel/client.conf

[.] stunnel 5.44 on arm-buildroot-linux-gnueabi platform

[.] Compiled/running with OpenSSL 1.0.2p-fips  14 Aug 2018

[.] Threading:FORK Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI

[ ] errno: (*__errno_location ())

[.] Reading configuration from file /etc/stunnel/client.conf

[.] UTF-8 byte order mark not detected

[ ] Enabling support for engine "cryptodev"

[!] error queue: 2606A074: error:2606A074:engine routines:ENGINE_by_id:no such engine

[!] error queue: 260B6084: error:260B6084:engine routines:DYNAMIC_LOAD:dso not found

[!] error queue: 25070067: error:25070067:DSO support routines:DSO_load:could not load the shared library

[!] ENGINE_by_id: 25066067: error:25066067:DSO support routines:DLFCN_LOAD:could not load the shared library

[!] /etc/stunnel/client.conf:5: "engine = cryptodev": Failed to open the engine

#

 

Again, this suggests stunnel is trying to use cryptodev.  I just don’t know how to prove I am taking advantage of the HW encryption acceleration engine.  I never see interrupts updating in /proc/interrupts when using nc, while stunnel is running.
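
Watching /proc/interrupts by eye is easy to get wrong on a short transfer, so a before/after count may be more convincing. The sketch below sums the atmel-aes counters and reports the difference across one command. The function names and the `IRQ_FILE` variable are made up for illustration, and the line format of /proc/interrupts on this board is an assumption.

```shell
# aes_irq_total [FILE]
# Sums the per-CPU counters on every line of FILE (default /proc/interrupts)
# that mentions atmel-aes. Only the run of numeric fields directly after the
# IRQ number is added, so a later hwirq number is not counted.
aes_irq_total() {
    awk '/atmel-aes/ {
             for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
         }
         END { printf "%.0f\n", total }' "${1:-/proc/interrupts}"
}

# irq_delta COMMAND [ARGS...]
# Runs COMMAND, then prints how many atmel-aes interrupts were counted while
# it ran. IRQ_FILE may point at a different file for testing.
irq_delta() {
    before=$(aes_irq_total "$IRQ_FILE")
    "$@"
    after=$(aes_irq_total "$IRQ_FILE")
    echo $((after - before))
}
```

For the transfer described above this would be `irq_delta nc 127.0.0.1 2000 < /tmp/long_file.txt`. A result of 0 on a large file would suggest the bulk encryption is not reaching the atmel-aes driver, whatever the engine configuration says.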

 

So, here are my questions:

 

1.) Does it look like I have things set up correctly in client.conf, to use the cryptodev engine?

2.) If client.conf is correct, how can I prove that stunnel is using the cryptodev engine, since I don’t see the expected interrupts?

 

One idea is that the cryptodev module might not support the type of encryption being requested by the certificate, so openssl falls back to the pure SW encryption implementation.   I know the Atmel chip in question supports the following:

 

# openssl engine -t -c

(cryptodev) cryptodev engine

[RSA, DSA, DH, DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, MD5, SHA1, SHA256, SHA384, SHA512]

      [ available ]

(dynamic) Dynamic engine loading support

      [ unavailable ]

#

 

I was able to decode the contents of the certificate, and its signature algorithm is sha256WithRSAEncryption.  My engine supports SHA256 and RSA individually, but does it support the combination, as in SHA256WithRSA?  I’m not sure.  I’ll keep chasing that one.
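
A related check is which cipher suite the TLS session actually negotiates. The certificate's signature algorithm only describes how the certificate was signed; the bulk cipher is agreed during the handshake. If the session ends up on a suite whose bulk cipher is not in the engine's list (an AES-GCM suite, for example), the data encryption would presumably stay in software. The sketch below pulls the suite name out of `openssl s_client` output; the function name `negotiated_cipher` is hypothetical.

```shell
# negotiated_cipher
# Reads `openssl s_client` output on stdin and prints the suite named on the
# first "Cipher    : ..." line of the session summary.
negotiated_cipher() {
    awk -F': *' '/^ *Cipher *:/ { print $2; exit }'
}

# Example against the server from client.conf (not run here):
#   openssl s_client -connect 192.168.0.220:30000 -engine cryptodev \
#       < /dev/null 2>/dev/null | negotiated_cipher
#
# Adding "-cipher AES128-SHA" asks for an AES-128-CBC suite, which is in the
# engine's list; whether the server accepts it depends on its configuration.
```

If an AES-CBC suite makes the atmel-aes interrupts appear, the matching stunnel setting would be a `ciphers = AES128-SHA` line in the [test] section. With debug = 7, the stunnel log should also name the negotiated suite, so `grep -i ciphersuite /tmp/stunnel-server.log` is worth a look, although the exact wording of that log line is an assumption.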

 

Thanks for any guidance on how to use the cryptodev in stunnel.

 

Regards,

Tamar