[PATCH v4] crypto: riscv/poly1305 - import OpenSSL/CRYPTOGAMS implementation
Zhihang Shao
zhihang.shao.iscas at gmail.com
Sun Jul 20 02:10:27 PDT 2025
Hi Eric,
I recently ran the KUnit test module you wrote for poly1305 on QEMU
RISC-V 64. The results show that the optimized implementation is
significantly faster than the generic one: benchmark throughput improves
by roughly 1.3x at len=16 and by up to roughly 1.9x at len=16384. The
test data are as follows:
--- base.log 2025-07-19 17:41:06.443392989 +0800
+++ optimized.log 2025-07-19 17:40:45.650048601 +0800
@@ -1,31 +1,31 @@
-[ 0.668631] # Subtest: poly1305
-[ 0.668774] # module: poly1305_kunit
-[ 0.668857] 1..12
-[ 0.670267] ok 1 test_hash_test_vectors
-[ 0.679479] ok 2 test_hash_all_lens_up_to_4096
-[ 0.696048] ok 3 test_hash_incremental_updates
-[ 0.697645] ok 4 test_hash_buffer_overruns
-[ 0.701060] ok 5 test_hash_overlaps
-[ 0.702858] ok 6 test_hash_alignment_consistency
-[ 0.703108] ok 7 test_hash_ctx_zeroization
-[ 0.846150] ok 8 test_hash_interrupt_context_1
-[ 1.235247] ok 9 test_hash_interrupt_context_2
-[ 1.250813] ok 10 test_poly1305_allones_keys_and_message
-[ 1.251138] ok 11 test_poly1305_reduction_edge_cases
-[ 1.287196] # benchmark_hash: len=1: 2 MB/s
-[ 1.305363] # benchmark_hash: len=16: 61 MB/s
-[ 1.321102] # benchmark_hash: len=64: 212 MB/s
-[ 1.340105] # benchmark_hash: len=127: 263 MB/s
-[ 1.353880] # benchmark_hash: len=128: 364 MB/s
-[ 1.370118] # benchmark_hash: len=200: 377 MB/s
-[ 1.381879] # benchmark_hash: len=256: 570 MB/s
-[ 1.394125] # benchmark_hash: len=511: 657 MB/s
-[ 1.404265] # benchmark_hash: len=512: 794 MB/s
-[ 1.413356] # benchmark_hash: len=1024: 985 MB/s
-[ 1.421925] # benchmark_hash: len=3173: 1131 MB/s
-[ 1.429956] # benchmark_hash: len=4096: 1218 MB/s
-[ 1.438184] # benchmark_hash: len=16384: 1216 MB/s
-[ 1.438462] ok 12 benchmark_hash
-[ 1.438686] # poly1305: pass:12 fail:0 skip:0 total:12
-[ 1.438763] # Totals: pass:12 fail:0 skip:0 total:12
-[ 1.438904] ok 1 poly1305
+[ 0.666280] # Subtest: poly1305
+[ 0.666413] # module: poly1305_kunit
+[ 0.666490] 1..12
+[ 0.667702] ok 1 test_hash_test_vectors
+[ 0.672896] ok 2 test_hash_all_lens_up_to_4096
+[ 0.686244] ok 3 test_hash_incremental_updates
+[ 0.687263] ok 4 test_hash_buffer_overruns
+[ 0.689957] ok 5 test_hash_overlaps
+[ 0.691393] ok 6 test_hash_alignment_consistency
+[ 0.691622] ok 7 test_hash_ctx_zeroization
+[ 0.769741] ok 8 test_hash_interrupt_context_1
+[ 0.930832] ok 9 test_hash_interrupt_context_2
+[ 0.940068] ok 10 test_poly1305_allones_keys_and_message
+[ 0.940478] ok 11 test_poly1305_reduction_edge_cases
+[ 0.964546] # benchmark_hash: len=1: 3 MB/s
+[ 0.978836] # benchmark_hash: len=16: 78 MB/s
+[ 0.990414] # benchmark_hash: len=64: 289 MB/s
+[ 1.003012] # benchmark_hash: len=127: 397 MB/s
+[ 1.012755] # benchmark_hash: len=128: 517 MB/s
+[ 1.022928] # benchmark_hash: len=200: 603 MB/s
+[ 1.030981] # benchmark_hash: len=256: 835 MB/s
+[ 1.038706] # benchmark_hash: len=511: 1046 MB/s
+[ 1.045233] # benchmark_hash: len=512: 1240 MB/s
+[ 1.050733] # benchmark_hash: len=1024: 1638 MB/s
+[ 1.055620] # benchmark_hash: len=3173: 1998 MB/s
+[ 1.060247] # benchmark_hash: len=4096: 2132 MB/s
+[ 1.064695] # benchmark_hash: len=16384: 2267 MB/s
+[ 1.065179] ok 12 benchmark_hash
+[ 1.065425] # poly1305: pass:12 fail:0 skip:0 total:12
+[ 1.065498] # Totals: pass:12 fail:0 skip:0 total:12
+[ 1.065612] ok 1 poly1305
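For reference, the per-length speedup can be derived from the two logs
with a short script along the following lines. This is only a minimal
sketch: the helper names parse_bench and speedups are illustrative and
not part of the kernel tree, and the sample strings below are a subset
of the benchmark_hash lines quoted above.

```python
import re

# Matches KUnit lines such as "# benchmark_hash: len=16: 61 MB/s".
BENCH_RE = re.compile(r"benchmark_hash: len=(\d+): (\d+) MB/s")


def parse_bench(log_text):
    """Map message length -> throughput in MB/s found in the log text."""
    return {int(m.group(1)): int(m.group(2))
            for m in BENCH_RE.finditer(log_text)}


def speedups(base_log, opt_log):
    """Return optimized/generic throughput ratio for each common length."""
    base = parse_bench(base_log)
    opt = parse_bench(opt_log)
    return {n: opt[n] / base[n]
            for n in sorted(base) if n in opt and base[n] > 0}


# Subset of the lines from base.log and optimized.log shown above.
base_log = """
[ 1.305363] # benchmark_hash: len=16: 61 MB/s
[ 1.413356] # benchmark_hash: len=1024: 985 MB/s
[ 1.438184] # benchmark_hash: len=16384: 1216 MB/s
"""
opt_log = """
[ 0.978836] # benchmark_hash: len=16: 78 MB/s
[ 1.050733] # benchmark_hash: len=1024: 1638 MB/s
[ 1.064695] # benchmark_hash: len=16384: 2267 MB/s
"""

for n, ratio in speedups(base_log, opt_log).items():
    print(f"len={n}: {ratio:.2f}x")
```

With the three sample lengths this prints 1.28x, 1.66x and 1.86x. The
reported values are whole MB/s, so ratios for very short inputs (for
example len=1, 2 MB/s vs 3 MB/s) are too coarse to be meaningful.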
Next, I plan to validate this performance gain on actual RISC-V
hardware. I will also submit a v5 patch to the mailing list.
I look forward to your feedback and suggestions.
- Zhihang