A kernel module that stress-tests the crypto API.

In the first table, we submit one request at a time - the next request is submitted only after the previous one has finished. QAT underperforms for requests smaller than 64k. For larger requests, it performs better.

single request:
              THROUGHPUT bytes/s
REQUEST SIZE  QAT             AES-NI
16:           978080          357677552
32:           2427808         631892384
64:           4019328         851706944
128:          9846656         1075419648
256:          19770880        1214184960
512:          37902336        1314044928
1024:         61189120        1364847616
2048:         146108416       1391562752
4096:         270344192       1405132800
8192:         479846400       1412497408
16384:        658407424       1415512064
32768:        1107525632      1409187840
65536:        1495072768      1417543680
131072:       1802895360      1411514368
262144:       2019819520      1418461184
524288:       2155347968      1419247616
1048576:      2216689664      1419771904
2097152:      2246049792      1419771904
4194304:      2264924160      1421869056

In the second table, we run 112 concurrent threads, each of them submitting one crypto request at a time. QAT underperforms by roughly a factor of ten for large requests, and by much more for small ones. The reason is that there are 56 cores with AES-NI and only 2 QAT cores - so, even if one QAT core is faster than one AES-NI core, the sheer number of AES-NI cores makes AES-NI win.

112 concurrent requests:
              THROUGHPUT bytes/s
REQUEST SIZE  QAT             AES-NI
16:           11627522        12526961141
32:           25887969        23364631478
64:           51547249        37742683288
128:          98184715        55630264212
256:          194229221       72237997600
512:          421332661       84991642915
1024:         785504214       93236052844
2048:         1652392658      98019030678
4096:         3171583033      97251437539
8192:         5879683878      99869242625
16384:        5301093806      101190068330
32768:        6831875452      101943556942
65536:        7475811005      101965687867
131072:       8188031765      102472310301
262144:       8575568568      102368093506
524288:       8731863820      101478581102
1048576:      8702213733      100123092114
2097152:      8572197063      96390151699
4194304:      8678727841      83703855556

This table shows CPU consumption of a single-threaded workload - i.e. how many bytes of data we can encrypt per second of CPU time.
For AES-NI, the numbers are very similar to the first table because AES-NI simply consumes 100% CPU time while encrypting. For QAT we see increasing numbers because the CPU consumption is basically independent of the request size: a large request costs about as much CPU time as a small one.

              CPU CONSUMPTION bytes/cpu_seconds
REQUEST SIZE  QAT             AES-NI
16:           3223674         351026478
32:           7667255         621317005
64:           13228375        844430194
128:          31404379        1056636578
256:          56435200        1196274553
512:          107753129       1290195390
1024:         267100504       1347276800
2048:         601031559       1373520601
4096:         1029135753      1387371976
8192:         1944426773      1388917841
16384:        4356786745      1397754925
32768:        7339146378      1399986793
65536:        16185902545     1393857839
131072:       32360740571     1401872253
262144:       54632226594     1401302431
524288:       107793612800    1393775524
1048576:      122974663111    1400523786
2097152:      187170816000    1402951040
4194304:      174224935384    1403638368
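
The crossover point and the CPU-time advantage quoted above can be rechecked from the tables. The following Python sketch does that for a few rows copied verbatim from the single-request and CPU-consumption tables. The helper names (crossover_size, ratio) are made up for this illustration and are not part of the kernel module.

```python
# Rows copied from the tables above: request size -> (QAT, AES-NI).
# Only a few rows around the points of interest are included.
single_request = {  # throughput, bytes/s
    16384: (658407424, 1415512064),
    32768: (1107525632, 1409187840),
    65536: (1495072768, 1417543680),
    131072: (1802895360, 1411514368),
}

cpu_consumption = {  # bytes per second of CPU time
    16: (3223674, 351026478),
    4194304: (174224935384, 1403638368),
}


def crossover_size(table):
    """Smallest request size at which QAT beats AES-NI, or None if it never does."""
    for size in sorted(table):
        qat, aesni = table[size]
        if qat > aesni:
            return size
    return None


def ratio(table, size):
    """QAT value divided by AES-NI value for the given request size."""
    qat, aesni = table[size]
    return qat / aesni


if __name__ == "__main__":
    print("QAT overtakes AES-NI at request size:", crossover_size(single_request))
    print("QAT/AES-NI bytes per CPU second at 4 MiB: %.1f"
          % ratio(cpu_consumption, 4194304))
    print("QAT/AES-NI bytes per CPU second at 16 B: %.4f"
          % ratio(cpu_consumption, 16))
```

With these rows the crossover lands at 65536 bytes, which matches the "smaller than 64k" remark, and at 4 MiB QAT encrypts over a hundred times more bytes per CPU second than AES-NI.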