[PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

Li Qiang liqiang64 at huawei.com
Mon Nov 9 01:29:51 EST 2020



On 2020/11/6 2:21, Eric Biggers wrote:
> On Thu, Nov 05, 2020 at 05:05:53PM +0800, Li Qiang wrote:
>>
>>
>> On 2020/11/5 15:51, Ard Biesheuvel wrote:
>>> Note that NEON intrinsics can be compiled for 32-bit ARM as well (with
>>> a bit of care - please refer to lib/raid6/recov_neon_inner.c for an
>>> example of how to deal with intrinsics that are only available on
>>> arm64) and are less error prone, so intrinsics should be preferred if
>>> feasible.
>>>
>>> However, you have still not explained how optimizing Adler32 makes a
>>> difference for a real-world use case. Where is libdeflate used on a
>>> hot path?
>>
>> Sorry :(, I have not specifically searched for uses of this algorithm
>> in the kernel.
>>
>> When I previously profiled the libz library with perf, adler32
>> accounted for a large share of the hot spots. I then noticed that the
>> algorithm is also used in the kernel code, so I thought optimizing it
>> might benefit the kernel as well. :)
> 
> Adler32 performance is important for zlib compression/decompression, which has a
> few use cases in the kernel such as btrfs compression.  However, these days
> those few kernel use cases are mostly switching to newer algorithms like lz4 and
> zstd.  Also as I mentioned, your patch doesn't actually wire up your code to be
> used by the kernel's implementation of zlib compression/decompression.
> 
> I think you'd be much better off contributing to a userspace project, where
> DEFLATE/zlib/gzip support still has a long tail of use cases.  The official zlib
> isn't really being maintained and isn't accepting architecture-specific
> optimizations, but there are some performance-oriented forks of zlib (e.g.
> https://chromium.googlesource.com/chromium/src/third_party/zlib/ and
> https://github.com/zlib-ng/zlib-ng), as well as other projects like libdeflate
> (https://github.com/ebiggers/libdeflate).  Generally I'm happy to accept
> architecture-specific optimizations in libdeflate, but they need to be testable.
> 
> - Eric
> 

Thank you for your answers and suggestions. I had not seen these repositories
before. For the SVE implementation of adler32, I will look into the
repositories you mentioned. :)
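
Since any architecture-specific version needs to be testable, a plain scalar
Adler-32 is a natural baseline to compare against. The following is only a
minimal sketch, not code from the patch or from any of the projects above; the
names adler32_ref, ADLER_MOD and ADLER_NMAX are hypothetical and chosen here
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD  65521u /* largest prime below 2^16 */
#define ADLER_NMAX 5552   /* bytes that can be summed before reducing */

/*
 * Scalar reference Adler-32 (hypothetical helper name).
 * 'adler' is the running checksum; pass 1 to start a new one.
 */
uint32_t adler32_ref(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t a = adler & 0xffff;         /* low half: sum of bytes + 1 */
    uint32_t b = (adler >> 16) & 0xffff; /* high half: sum of the 'a' values */

    while (len > 0) {
        /* Process bounded chunks so the 32-bit sums cannot overflow. */
        size_t chunk = len < ADLER_NMAX ? len : ADLER_NMAX;

        len -= chunk;
        while (chunk--) {
            a += *buf++;
            b += a;
        }
        a %= ADLER_MOD;
        b %= ADLER_MOD;
    }
    return (b << 16) | a;
}
```

A vectorized implementation could be checked by running both versions over the
same buffers (including empty, short, and unaligned inputs) and comparing the
results.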

-- 
Best regards,
Li Qiang



More information about the linux-arm-kernel mailing list