[PATCH v6 6/9] mm: multigenerational lru: aging

Yu Zhao yuzhao at google.com
Fri Jan 7 15:36:11 PST 2022


On Fri, Jan 07, 2022 at 02:11:29PM +0100, Michal Hocko wrote:
> On Tue 04-01-22 13:22:25, Yu Zhao wrote:
> [...]
> > +static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc)
> > +{
> > +	struct mem_cgroup *memcg;
> > +	bool success = false;
> > +	unsigned long min_ttl = READ_ONCE(lru_gen_min_ttl);
> > +
> > +	VM_BUG_ON(!current_is_kswapd());
> > +
> > +	current->reclaim_state->mm_walk = &pgdat->mm_walk;
> > +
> > +	memcg = mem_cgroup_iter(NULL, NULL, NULL);
> > +	do {
> > +		struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
> > +
> > +		if (age_lruvec(lruvec, sc, min_ttl))
> > +			success = true;
> > +
> > +		cond_resched();
> > +	} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)));
> > +
> > +	if (!success && mutex_trylock(&oom_lock)) {
> > +		struct oom_control oc = {
> > +			.gfp_mask = sc->gfp_mask,
> > +			.order = sc->order,
> > +		};
> > +
> > +		if (!oom_reaping_in_progress())
> > +			out_of_memory(&oc);
> > +
> > +		mutex_unlock(&oom_lock);
> > +	}
> 
> Why do you need to trigger oom killer from this path? Why cannot you
> rely on the page allocator to do that like we do now?

This is per desktop users' (repeated) requests. They can't tolerate
thrashing the way servers can, because it causes UI lags, and they
usually don't have fancy tools like oomd.

Related discussions I saw:
https://github.com/zen-kernel/zen-kernel/issues/218
https://lore.kernel.org/lkml/20101028191523.GA14972@google.com/
https://lore.kernel.org/lkml/20211213051521.21f02dd2@mail.inbox.lv/
https://lore.kernel.org/lkml/54C2C89C.8080002@gmail.com/
https://lore.kernel.org/lkml/d9802b6a-949b-b327-c4a6-3dbca485ec20@gmx.com/

From patch 8:
  Personal computers
  ------------------
  :Thrashing prevention: Write ``N`` to
   ``/sys/kernel/mm/lru_gen/min_ttl_ms`` to prevent the working set of
   ``N`` milliseconds from getting evicted. The OOM killer is invoked if
   this working set can't be kept in memory. Based on the average human
   detectable lag (~100ms), ``N=1000`` usually eliminates intolerable
   lags due to thrashing. Larger values like ``N=3000`` make lags less
   noticeable at the cost of more OOM kills.
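
For illustration only (this is not part of the series): a minimal
userspace sketch of setting the knob described above. The helper name
set_min_ttl_ms() is hypothetical; the sysfs path is the one given in
the documentation excerpt. It assumes a kernel with this patchset
applied and sufficient privileges to write the file.

```c
#include <errno.h>
#include <stdio.h>

/* Path taken from the documentation excerpt quoted from patch 8. */
#define LRU_GEN_MIN_TTL_PATH "/sys/kernel/mm/lru_gen/min_ttl_ms"

/*
 * Hypothetical helper (not part of the patchset): write a TTL in
 * milliseconds to the given file. Returns 0 on success or a negative
 * errno value on failure.
 */
static int set_min_ttl_ms(const char *path, unsigned int ms)
{
	FILE *f = fopen(path, "w");
	int ret = 0;

	if (!f)
		return -errno;

	if (fprintf(f, "%u\n", ms) < 0)
		ret = -EIO;

	/* Buffered data is flushed here, so write errors can surface late. */
	if (fclose(f) != 0 && !ret)
		ret = -errno;

	return ret;
}
```

Calling set_min_ttl_ms(LRU_GEN_MIN_TTL_PATH, 1000) would correspond to
the N=1000 setting mentioned in the excerpt. The path is a parameter
only so the helper can be exercised against an ordinary file.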


