Race condition in build_all_zonelists() when offlining movable zone

Mel Gorman mgorman at suse.de
Wed Aug 24 02:45:11 PDT 2022


On Tue, Aug 23, 2022 at 05:51:25PM +0200, David Hildenbrand wrote:
> >> The race is simple -- page allocation could be in progress when a memory
> >> hot-remove operation triggers a zonelist rebuild that removes zones.
> >> The allocation request will still have a valid ac->preferred_zoneref that
> >> is now pointing to NULL and triggers an OOM kill.
> >>
> >> This problem probably always existed but may be slighly easier to trigger
> 
> s/slighly/slightly/
> 

Fixed.

> >> due to 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones
> >> with pages managed by the buddy allocator") which distinguishes between
> >> zones that are completely unpopulated versus zones that have valid pages
> >> but they are all reserved. Memory hotplug had multiple stages with
> 
> Not necessarily reserved, simply not managed by the buddy (e.g., early
> allocations, memory ballooning / virtio-mem).
> 

Fair point, I filed all that under "reserved" but you're right, this is
clearer.

> >> +/*
> >> + * Zonelists may change due to hotplug during allocation. Detect when zonelists
> >> + * have been rebuilt so allocation retries. Reader side does not lock and
> >> + * retries the allocation if zonelist changes. Writer side is protected by the
> >> + * embedded spin_lock.
> >> + */
> >> +DEFINE_SEQLOCK(zonelist_update_seq);
> >> +
> >> +static unsigned int zonelist_iter_begin(void)
> >> +{
> > 
> > You likely want something like
> > 	if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
> > 		return read_seqbegin(&zonelist_update_seq);
> > 	return 0;
> > 
> >> +	return read_seqbegin(&zonelist_update_seq);
> >> +}
> >> +
> >> +static unsigned int check_retry_zonelist(unsigned int seq)
> >> +{
> > 	if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
> > 		return read_seqretry(&zonelist_update_seq, seq);
> > 	return seq;
> > 
> >> +	return read_seqretry(&zonelist_update_seq, seq);
> >> +}
> >> +
> > 
> > to avoid overhead on systems without HOTREMOVE configured.
> > 
> > Other than that LGTM.
> > Acked-by: Michal Hocko <mhocko at suse.com>
> 

It's a minor saving but fair enough. HOTREMOVE is now the only reason
zonelists can be rebuilt. While that was not always true, if it ever
changes again, it's a simple fix.

Thanks Michal.

> Makes sense to me, although I wonder how much it will matter in practice.
> 

Probably none at all as it's one branch but it's still valid.

> Reviewed-by: David Hildenbrand <david at redhat.com>
> 

Thanks David.

-- 
Mel Gorman
SUSE Labs



More information about the linux-arm-kernel mailing list