I'm testing Stunnel with an Ixia traffic generator (BreakingPoint Virtual Edition) and I'm getting limited performance of 3.5 Gbps full duplex, even with 6 or 9 CPUs in the ESXi VM. The processors are fully dedicated and the memory is fully reserved, with passthrough of the network cards (Intel X540 10 Gbps, 2 ports). The two VMs are similar: one runs the stunnel client and the other runs the stunnel server.
The Stunnel performance page reports 5.5 Gbps, but it doesn't say whether that figure is half or full duplex, given that in operation stunnel has to handle two flows, one in each direction.
If the 5.5 Gbps is the total, that makes sense: I'm using 2 x Xeon Scalable Gold 6140, and my setup gave me 7 Gbps of encryption + decryption combined. I think AES-NI is being used, but I have no feasible way to check that on the fly; I can only see the overall performance.
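To make the two readings of the published figure concrete, here is a minimal sketch of the arithmetic. It assumes "full duplex" means the rate is sustained in each direction at the same time, so the total is twice the per-direction rate; the function name `aggregate_gbps` is hypothetical and only for illustration.

```python
def aggregate_gbps(per_direction_gbps: float, directions: int = 2) -> float:
    """Total throughput when the same rate flows in each direction at once."""
    return per_direction_gbps * directions


# Figures taken from this post.
measured_per_direction = 3.5   # Gbps, full duplex, measured with the traffic generator
published_figure = 5.5         # Gbps, from the Stunnel performance page

measured_total = aggregate_gbps(measured_per_direction)

# Reading 1: the published figure is a total (both directions summed).
print("measured total:", measured_total, "Gbps vs published total:", published_figure, "Gbps")

# Reading 2: the published figure is per direction (full duplex).
print("published as full duplex would total:", aggregate_gbps(published_figure), "Gbps")
```

Under the first reading my 7 Gbps total is above the published number; under the second reading it is well below it, which is why the distinction matters.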
I'm using the same cipher suite as the one listed in the Stunnel performance information.
Does anyone know whether the published performance numbers refer to the total throughput or to full-duplex throughput?
Does anyone have any tips for improving the performance?
Thanks, 
Luis Monteiro