[PATCH v10 06/14] mm: multi-gen LRU: minimal implementation

Barry Song 21cnbao at gmail.com
Mon Apr 18 02:58:26 PDT 2022


On Thu, Apr 7, 2022 at 3:16 PM Yu Zhao <yuzhao at google.com> wrote:
>
> To avoid confusion, the terms "promotion" and "demotion" will be
> applied to the multi-gen LRU, as a new convention; the terms
> "activation" and "deactivation" will be applied to the active/inactive
> LRU, as usual.
>
> The aging produces young generations. Given an lruvec, it increments
> max_seq when max_seq-min_seq+1 approaches MIN_NR_GENS. The aging
> promotes hot pages to the youngest generation when it finds them
> accessed through page tables; the demotion of cold pages happens
> consequently when it increments max_seq. The aging has the complexity
> O(nr_hot_pages), since it is only interested in hot pages. Promotion
> in the aging path does not require any LRU list operations, only the
> updates of the gen counter and lrugen->nr_pages[]; demotion, unless as
> the result of the increment of max_seq, requires LRU list operations,
> e.g., lru_deactivate_fn().
>
> The eviction consumes old generations. Given an lruvec, it increments
> min_seq when the lists indexed by min_seq%MAX_NR_GENS become empty. A
> feedback loop modeled after the PID controller monitors refaults over
> anon and file types and decides which type to evict when both types
> are available from the same generation.
>
> Each generation is divided into multiple tiers. Tiers represent
> different ranges of numbers of accesses through file descriptors. A
> page accessed N times through file descriptors is in tier
> order_base_2(N). Tiers do not have dedicated lrugen->lists[], only
> bits in folio->flags. In contrast to moving across generations, which
> requires the LRU lock, moving across tiers only involves operations on
> folio->flags. The feedback loop also monitors refaults over all tiers
> and decides when to protect pages in which tiers (N>1), using the
> first tier (N=0,1) as a baseline. The first tier contains single-use
> unmapped clean pages, which are most likely the best choices. The
> eviction moves a page to the next generation, i.e., min_seq+1, if the
> feedback loop decides so. This approach has the following advantages:
> 1. It removes the cost of activation in the buffered access path by
>    inferring whether pages accessed multiple times through file
>    descriptors are statistically hot and thus worth protecting in the
>    eviction path.
> 2. It takes pages accessed through page tables into account and avoids
>    overprotecting pages accessed multiple times through file
>    descriptors. (Pages accessed through page tables are in the first
>    tier, since N=0.)
> 3. More tiers provide better protection for pages accessed more than
>    twice through file descriptors, when under heavy buffered I/O
>    workloads.
>

Hi Yu,
As I told you before,  I tried to change the current LRU (not MGLRU) by only
promoting unmapped file pages to the head of the inactive head rather than
the active head on its second access:
https://lore.kernel.org/lkml/CAGsJ_4y=TkCGoWWtWSAptW4RDFUEBeYXwfwu=fUFvV4Sa4VA4A@mail.gmail.com/
I have already seen some very good results by the decease of cpu consumption of
kswapd and direct reclamation in the testing.

in mglru, it seems "twice" isn't a concern at all, one unmapped file
page accessed
twice has no much difference with those ones which are accessed once as you
only begin to increase refs from the third time:

+static void folio_inc_refs(struct folio *folio)
+{
+       unsigned long refs;
+       unsigned long old_flags, new_flags;
+
+       if (folio_test_unevictable(folio))
+               return;
+
+       /* see the comment on MAX_NR_TIERS */
+       do {
+               new_flags = old_flags = READ_ONCE(folio->flags);
+
+               if (!(new_flags & BIT(PG_referenced))) {
+                       new_flags |= BIT(PG_referenced);
+                       continue;
+               }
+
+               if (!(new_flags & BIT(PG_workingset))) {
+                       new_flags |= BIT(PG_workingset);
+                       continue;
+               }
+
+               refs = new_flags & LRU_REFS_MASK;
+               refs = min(refs + BIT(LRU_REFS_PGOFF), LRU_REFS_MASK);
+
+               new_flags &= ~LRU_REFS_MASK;
+               new_flags |= refs;
+       } while (new_flags != old_flags &&
+                cmpxchg(&folio->flags, old_flags, new_flags) != old_flags);
+}

So my question is what makes you so confident that twice doesn't need
any special treatment while the vanilla kernel is upgrading this kind of page
to the head of the active instead? I am asking this because I am considering
reclaiming unmapped file pages which are only accessed twice when they
get to the tail of the inactive list.

Thanks
Barry



More information about the linux-arm-kernel mailing list