[PATCH] mm: Add Kcompressd for accelerated memory compression

Nhat Pham nphamcs at gmail.com
Wed May 7 08:00:59 PDT 2025


On Tue, May 6, 2025 at 7:04 PM Barry Song <21cnbao at gmail.com> wrote:
>
> On Wed, May 7, 2025 at 1:50 PM Zi Yan <ziy at nvidia.com> wrote:
> >
> > On 6 May 2025, at 21:12, Harry Yoo wrote:
> >
> > > On Wed, Apr 30, 2025 at 04:26:41PM +0800, Qun-Wei Lin wrote:
> > >> This patch series introduces a new mechanism called kcompressd to
> > >> improve the efficiency of memory reclaiming in the operating system.
> > >>
> > >> Problem:
> > >>   In the current system, the kswapd thread is responsible for both scanning
> > >>   the LRU pages and handling memory compression tasks (such as those
> > >>   involving ZSWAP/ZRAM, if enabled). This combined responsibility can lead
> > >>   to significant performance bottlenecks, especially under high memory
> > >>   pressure. The kswapd thread becomes a single point of contention, causing
> > >>   delays in memory reclaiming and overall system performance degradation.
> > >>
> > >> Solution:
> > >>   Introduced kcompressd to handle asynchronous compression during memory
> > >>   reclaim, improving efficiency by offloading compression tasks from
> > >>   kswapd. This allows kswapd to focus on its primary task of page reclaim
> > >>   without being burdened by the additional overhead of compression.
> > >>
> > >> In our handheld devices, we found that applying this mechanism under high
> > >> memory pressure scenarios can increase the rate of pgsteal_anon per second
> > >> by over 260% compared to the situation with only kswapd. Additionally, we
> > >> observed a reduction of over 50% in page allocation stall occurrences,
> > >> further demonstrating the effectiveness of kcompressd in alleviating memory
> > >> pressure and improving system responsiveness.
> > >>
> > >> Co-developed-by: Barry Song <21cnbao at gmail.com>
> > >> Signed-off-by: Barry Song <21cnbao at gmail.com>
> > >> Signed-off-by: Qun-Wei Lin <qun-wei.lin at mediatek.com>
> > >> Reference: Re: [PATCH 0/2] Improve Zram by separating compression context from kswapd - Barry Song
> > >>           https://lore.kernel.org/lkml/20250313093005.13998-1-21cnbao@gmail.com/
> > >> ---
> > >
> > > +Cc Zi Yan, who might be interested in writing a framework (or improving
> > > the existing one, padata) for parallelizing jobs (e.g. migration/compression)
> >
> > Thanks.
> >
> > I am currently looking into padata [1] to perform multithreaded page migration

TIL about padata :)

> > copy job. But based on this patch, it seems that kcompressd is just an additional
> > kernel thread for executing zswap_store(). Is there any need for performing
> > compression with multiple threads?
>
> The current focus is on enabling kswapd to perform asynchronous compression,
> which can significantly reduce direct reclaim and allocstall events.
> Therefore, the work begins with supporting a single thread. Supporting
> multiple threads might be possible in the future, but it could be difficult
> to control—especially on busy phones—since it consumes more power and may
> interfere with other threads impacting user experience.

Right, yeah.

>
> >
> > BTW, I also notice that the zswap IAA compress batching patchset [2] uses a
> > hardware accelerator (Intel In-Memory Analytics Accelerator) to speed up zswap.
> > I wonder if the handheld devices have similar hardware to get a similar benefit.
>
> Usually, the answer is no. We use zRAM and CPU, but this patch aims to provide
> a common capability that can be shared by both zRAM and zswap.
>

Also, not every user or setup has access to hardware compression
accelerators :) This provides benefits for all users.
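
For anyone skimming the thread, here is a rough userspace sketch of the
hand-off being discussed: one producer (standing in for kswapd) queues pages
for a single worker thread (standing in for kcompressd) and only compresses
inline when the queue is full. This is not the patch's code. The pthread
queue, the names offload_queue, reclaim_page, compress_page and run_demo, the
slot count, and the fall-back-when-full policy are all assumptions made up
for illustration.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SLOTS 8 /* hypothetical bound, not taken from the patch */

struct offload_queue {
	pthread_mutex_t lock;
	pthread_cond_t not_empty;
	int pages[QUEUE_SLOTS]; /* stand-in for page pointers */
	size_t head, count;
	bool stopping;
	int compressed_async; /* pages handled by the worker */
	int compressed_sync;  /* pages the producer handled inline */
};

/* Placeholder for the real compression work (hypothetical). */
static void compress_page(int page)
{
	(void)page;
}

/* Worker thread: plays the role of kcompressd in this sketch. */
static void *worker_main(void *arg)
{
	struct offload_queue *q = arg;

	pthread_mutex_lock(&q->lock);
	for (;;) {
		while (q->count == 0 && !q->stopping)
			pthread_cond_wait(&q->not_empty, &q->lock);
		if (q->count == 0)
			break; /* asked to stop and the queue is drained */

		int page = q->pages[q->head];
		q->head = (q->head + 1) % QUEUE_SLOTS;
		q->count--;

		/* Compress without holding the lock so the producer can keep going. */
		pthread_mutex_unlock(&q->lock);
		compress_page(page);
		pthread_mutex_lock(&q->lock);
		q->compressed_async++;
	}
	pthread_mutex_unlock(&q->lock);
	return NULL;
}

/* Producer side: plays the role of kswapd handing off one page. */
static void reclaim_page(struct offload_queue *q, int page)
{
	pthread_mutex_lock(&q->lock);
	if (q->count < QUEUE_SLOTS) {
		q->pages[(q->head + q->count) % QUEUE_SLOTS] = page;
		q->count++;
		pthread_cond_signal(&q->not_empty);
		pthread_mutex_unlock(&q->lock);
		return;
	}
	/* Queue full: assumed fallback is to compress inline. */
	q->compressed_sync++;
	pthread_mutex_unlock(&q->lock);
	compress_page(page);
}

/* Push npages fake pages through and return how many were compressed. */
static int run_demo(int npages)
{
	struct offload_queue q;
	pthread_t worker;

	pthread_mutex_init(&q.lock, NULL);
	pthread_cond_init(&q.not_empty, NULL);
	q.head = 0;
	q.count = 0;
	q.stopping = false;
	q.compressed_async = 0;
	q.compressed_sync = 0;

	pthread_create(&worker, NULL, worker_main, &q);
	for (int i = 0; i < npages; i++)
		reclaim_page(&q, i);

	/* Tell the worker to finish what is queued, then exit. */
	pthread_mutex_lock(&q.lock);
	q.stopping = true;
	pthread_cond_broadcast(&q.not_empty);
	pthread_mutex_unlock(&q.lock);
	pthread_join(worker, NULL);

	int total = q.compressed_async + q.compressed_sync;
	pthread_cond_destroy(&q.not_empty);
	pthread_mutex_destroy(&q.lock);
	return total;
}
```

Calling run_demo(100) pushes 100 fake pages through. Each page is compressed
exactly once, by either the worker or the producer, and the split between the
two depends on scheduling.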


