Race condition in build_all_zonelists() when offlining movable zone
Mel Gorman
mgorman at suse.de
Wed Aug 24 02:45:11 PDT 2022
On Tue, Aug 23, 2022 at 05:51:25PM +0200, David Hildenbrand wrote:
> >> The race is simple -- page allocation could be in progress when a memory
> >> hot-remove operation triggers a zonelist rebuild that removes zones.
> >> The allocation request will still have a valid ac->preferred_zoneref that
> >> is now pointing to NULL and triggers an OOM kill.
> >>
> >> This problem probably always existed but may be slighly easier to trigger
>
> s/slighly/slightly/
>
Fixed.
> >> due to 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones
> >> with pages managed by the buddy allocator") which distinguishes between
> >> zones that are completely unpopulated versus zones that have valid pages
> >> but they are all reserved. Memory hotplug had multiple stages with
>
> Not necessarily reserved, simply not managed by the buddy (e.g., early
> allocations, memory ballooning / virtio-mem).
>
Fair point, I filed all that under "reserved" but you're right, this is
clearer.
> >> +/*
> >> + * Zonelists may change due to hotplug during allocation. Detect when zonelists
> >> + * have been rebuilt so allocation retries. Reader side does not lock and
> >> + * retries the allocation if zonelist changes. Writer side is protected by the
> >> + * embedded spin_lock.
> >> + */
> >> +DEFINE_SEQLOCK(zonelist_update_seq);
> >> +
> >> +static unsigned int zonelist_iter_begin(void)
> >> +{
> >
> > You likely want something like
> > if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
> > return read_seqbegin(&zonelist_update_seq);
> > return 0;
> >
> >> + return read_seqbegin(&zonelist_update_seq);
> >> +}
> >> +
> >> +static unsigned int check_retry_zonelist(unsigned int seq)
> >> +{
> > if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
> > return read_seqretry(&zonelist_update_seq, seq);
> > return seq;
> >
> >> + return read_seqretry(&zonelist_update_seq, seq);
> >> +}
> >> +
> >
> > to avoid overhead on systems without HOTREMOVE configured.
> >
> > Other than that LGTM.
> > Acked-by: Michal Hocko <mhocko at suse.com>
>
It's a minor saving but fair enough. HOTREMOVE is now the only reason
zonelists can be rebuilt. While that was not always true, if it ever
changes again, it's a simple fix.
Thanks Michal.
> Makes sense to me, although I wonder how much it will matter in practice.
>
Probably none at all as it's one branch but it's still valid.
> Reviewed-by: David Hildenbrand <david at redhat.com>
>
Thanks David.
--
Mel Gorman
SUSE Labs
More information about the linux-arm-kernel
mailing list