[PATCH] mm: Add Kcompressd for accelerated memory compression

Zi Yan ziy at nvidia.com
Wed May 7 08:12:27 PDT 2025


On 7 May 2025, at 11:00, Nhat Pham wrote:

> On Tue, May 6, 2025 at 7:04 PM Barry Song <21cnbao at gmail.com> wrote:
>>
>> On Wed, May 7, 2025 at 1:50 PM Zi Yan <ziy at nvidia.com> wrote:
>>>
>>> On 6 May 2025, at 21:12, Harry Yoo wrote:
>>>
>>>> On Wed, Apr 30, 2025 at 04:26:41PM +0800, Qun-Wei Lin wrote:
>>>>> This patch series introduces a new mechanism called kcompressd to
>>>>> improve the efficiency of memory reclaiming in the operating system.
>>>>>
>>>>> Problem:
>>>>>   In the current system, the kswapd thread is responsible for both scanning
>>>>>   the LRU pages and handling memory compression tasks (such as those
>>>>>   involving ZSWAP/ZRAM, if enabled). This combined responsibility can lead
>>>>>   to significant performance bottlenecks, especially under high memory
>>>>>   pressure. The kswapd thread becomes a single point of contention, causing
>>>>>   delays in memory reclaiming and overall system performance degradation.
>>>>>
>>>>> Solution:
>>>>>   Introduce kcompressd to handle asynchronous compression during memory
>>>>>   reclaim, improving efficiency by offloading compression tasks from
>>>>>   kswapd. This allows kswapd to focus on its primary task of page reclaim
>>>>>   without being burdened by the additional overhead of compression.
>>>>>
>>>>> On our handheld devices, we found that applying this mechanism under high
>>>>> memory pressure can increase the rate of pgsteal_anon per second
>>>>> by over 260% compared to using kswapd alone. Additionally, we
>>>>> observed a reduction of over 50% in page allocation stall occurrences,
>>>>> further demonstrating the effectiveness of kcompressd in alleviating memory
>>>>> pressure and improving system responsiveness.
>>>>>
>>>>> Co-developed-by: Barry Song <21cnbao at gmail.com>
>>>>> Signed-off-by: Barry Song <21cnbao at gmail.com>
>>>>> Signed-off-by: Qun-Wei Lin <qun-wei.lin at mediatek.com>
>>>>> Reference: Re: [PATCH 0/2] Improve Zram by separating compression context from kswapd - Barry Song
>>>>>          https://lore.kernel.org/lkml/20250313093005.13998-1-21cnbao@gmail.com/
>>>>> ---
>>>>
>>>> +Cc Zi Yan, who might be interested in writing a framework (or improving
>>>> the existing one, padata) for parallelizing jobs (e.g. migration/compression)
>>>
>>> Thanks.
>>>
>>> I am currently looking into padata [1] to perform multithreaded page migration
>
> TIL about padata :)
>
>>> copy job. But based on this patch, it seems that kcompressd is just an additional
>>> kernel thread for executing zswap_store(). Is there any need for performing
>>> compression with multiple threads?
>>
>> The current focus is on enabling kswapd to perform asynchronous compression,
>> which can significantly reduce direct reclaim and allocstall events.
>> Therefore, the work begins with supporting a single thread. Supporting
>> multiple threads might be possible in the future, but it could be difficult
>> to control, especially on busy phones, since it consumes more power and may
>> interfere with other threads, hurting the user experience.
>
> Right, yeah.
>
>>
>>>
>>> BTW, I also noticed that the zswap IAA compress batching patchset [2] uses a
>>> hardware accelerator (Intel Analytics Accelerator) to speed up zswap.
>>> I wonder if the handheld devices have similar hardware to get a similar benefit.
>>
>> Usually, the answer is no. We use zRAM and CPU, but this patch aims to provide
>> a common capability that can be shared by both zRAM and zswap.
>>
>
> Also, not everyone and every setup has access to hardware compression
> accelerators :) This provides benefits for all users.

Got it. Thanks for the explanation.
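
For anyone skimming the thread, the offloading idea in the quoted cover
letter (the reclaimer hands pages to one dedicated compression thread instead
of compressing inline) can be sketched in userspace roughly as below. This is
only an analogy, not the patch itself: CompressWorker, submit(), reclaim(),
the queue depth of 64, the use of zlib, and the "compress inline when the
queue is full" fallback are all assumptions made for illustration.

```python
import queue
import threading
import zlib

PAGE_SIZE = 4096  # illustrative page size


class CompressWorker:
    """Hypothetical userspace stand-in for a single kcompressd-like thread."""

    def __init__(self, max_pending=64):
        self.pending = queue.Queue(maxsize=max_pending)  # bounded hand-off queue
        self.store = {}                # page id -> compressed bytes
        self.lock = threading.Lock()   # protects self.store
        self.sync_fallbacks = 0        # pages the reclaimer compressed itself
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _compress(self, page_id, data):
        blob = zlib.compress(data)
        with self.lock:
            self.store[page_id] = blob

    def _run(self):
        # Worker loop: compress queued pages until a None sentinel arrives.
        while True:
            item = self.pending.get()
            if item is None:
                self.pending.task_done()
                break
            self._compress(*item)
            self.pending.task_done()

    def submit(self, page_id, data):
        """Called by the reclaimer: hand the page off, or compress inline
        if the queue is full (assumed fallback, not taken from the patch)."""
        try:
            self.pending.put_nowait((page_id, data))
        except queue.Full:
            self.sync_fallbacks += 1
            self._compress(page_id, data)

    def drain(self):
        self.pending.join()  # wait until every queued page is compressed

    def stop(self):
        self.pending.put(None)
        self.thread.join()


def reclaim(worker, pages):
    """kswapd-like loop: it only scans and submits, so it is not blocked by
    compression unless the queue overflows."""
    for page_id, data in pages.items():
        worker.submit(page_id, data)


pages = {i: bytes([i % 251]) * PAGE_SIZE for i in range(256)}
worker = CompressWorker(max_pending=64)
reclaim(worker, pages)
worker.drain()
worker.stop()
restored = {pid: zlib.decompress(blob) for pid, blob in worker.store.items()}
```

With a single worker and a bounded queue, the reclaimer's cost per page is
one enqueue in the common case, which matches the single-thread starting
point described above; scaling to several workers would be a separate
design question.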


--
Best Regards,
Yan, Zi


