Hello,

 

We are using stunnel 5.71 on Solaris 10 to tunnel NFS on more than 50 systems.

In the past we saw stunnel dumping core without any errors or hints in the stunnel log file.

It was believed to be an OpenSSL 1.x problem.

We got an stunnel version with SSL 2.x and 3.x but we had to go back to 1.x because of other problems.

 

We noticed that the dumps happened at regular times and found a script that made NFS requests every 10 minutes.

After we changed the script to make fewer NFS requests, stunnel was stable. It looked as if the problem accumulated over time and a counter or something similar reached its limit, which then resulted in the core dump.

 

Then the core dumps started again. They still seem to be related to the number of requests, because the problem occurs on the two NFS servers with the highest number of connections. This time we see errors in the stunnel log. The error varies slightly:

 

2023.10.12 22:05:43 LOG3[10745]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac

 

// There were also a few cases with an additional "INTERNAL ERROR" line, e.g.:

 

2023.10.16 00:23:53 LOG5[2968]: Service [tls-nfs-srv] connected remote server from 127.0.0.1:38270

2023.10.16 00:40:19 LOG3[2968]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac

2023.10.16 00:40:19 LOG3[2967]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac

2023.10.16 00:40:19 LOG5[2968]: Connection reset: 2476 byte(s) sent to TLS, 3324 byte(s) sent to socket

2023.10.16 00:40:19 LOG5[2967]: Connection reset: 2500 byte(s) sent to TLS, 3100 byte(s) sent to socket

INTERNAL ERROR: Bad magic at ssl.c, line 192

2023.10.16 00:40:20 LOG6[ui]: Initializing inetd mode configuration

 

// or

2023.10.23 07:33:53 LOG5[62]: Service [tls-nfs-srv] connected remote server from 127.0.0.1:50626

2023.10.23 07:38:37 LOG3[62]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac

2023.10.23 07:38:37 LOG5[62]: Connection reset: 364 byte(s) sent to TLS, 336 byte(s) sent to socket

...

2023.10.23 07:44:37 LOG5[61]: Connection reset: 1704 byte(s) sent to TLS, 2360 byte(s) sent to socket

INTERNAL ERROR: Bad magic at OpenSSL, line 0

 

// or

2023.10.24 08:20:59 LOG5[52]: Service [tls-nfs-srv] connected remote server from 127.0.0.1:43966

2023.10.24 08:56:13 LOG6[52]: Read socket closed (readsocket)

...

2023.10.24 09:00:01 LOG5[53]: Service [tls-nfs-srv] accepted connection from 1.2.3.4:59973

2023.10.24 09:00:01 LOG6[53]: Peer certificate not required

2023.10.24 09:00:01 LOG3[49]: SSL_read: s3_pkt.c:534: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac

2023.10.24 09:00:01 LOG5[49]: Connection reset: 7112 byte(s) sent to TLS, 9188 byte(s) sent to socket

INTERNAL ERROR: Dead canary at OpenSSL, line 0

 

 

I have attached a file with more details.

 

Although this looks like an OpenSSL problem at first glance, we think it could still be a problem with stunnel.

Can you make more out of this data?

 

Do you know a way to log more information about the connections?

For example, not all requests were accepted. What happened here?

 

2023.11.24 02:50:13 LOG7[359]:    354 server accept(s) requested

2023.11.24 02:50:13 LOG7[359]:    351 server accept(s) succeeded
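
To correlate the errors with load, the log could be tallied roughly as follows. This is only a minimal sketch in Python, assuming the log line format shown in the excerpts above. The function name summarize and the regular expressions are illustrative and not part of stunnel.

```python
import re
import sys
from collections import Counter

# Assumed line format, taken from the excerpts above, e.g.:
# 2023.10.12 22:05:43 LOG3[10745]: SSL_read: ...
LINE_RE = re.compile(
    r"^(?P<date>\d{4}\.\d{2}\.\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"LOG(?P<level>\d)\[(?P<tid>[^\]]+)\]: (?P<msg>.*)$"
)
# Matches the accept statistics lines, e.g. "354 server accept(s) requested"
ACCEPT_RE = re.compile(r"^\s*(\d+) server accept\(s\) (requested|succeeded)")


def summarize(lines):
    """Tally 'bad record mac' errors per hour, collect INTERNAL ERROR
    lines, and record the last seen accept counters."""
    mac_errors_per_hour = Counter()
    internal_errors = []
    accepts = {}
    for raw in lines:
        line = raw.strip()
        if line.startswith("INTERNAL ERROR:"):
            internal_errors.append(line)
            continue
        m = LINE_RE.match(line)
        if not m:
            continue  # blank lines, "...", comments
        msg = m.group("msg")
        if "bad record mac" in msg:
            hour = f"{m.group('date')} {m.group('time')[:2]}"
            mac_errors_per_hour[hour] += 1
        a = ACCEPT_RE.match(msg)
        if a:
            accepts[a.group(2)] = int(a.group(1))
    return mac_errors_per_hour, internal_errors, accepts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the path of the stunnel log file.
    with open(sys.argv[1], errors="replace") as fh:
        per_hour, internal, accepts = summarize(fh)
    for hour, count in sorted(per_hour.items()):
        print(hour, count)
    print(len(internal), "INTERNAL ERROR line(s)")
    if {"requested", "succeeded"} <= accepts.keys():
        print("accepts not succeeded:",
              accepts["requested"] - accepts["succeeded"])
```

For the two statistics lines above this reports 354 - 351 = 3 accepts that did not succeed. It does not explain why they failed.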

 

 

Thank you in advance.

 

Best regards

Sasha

 

 

 

