[PATCH] mm/migrate: fix race between lock page and clear PG_Isolated

Andrew Morton akpm at linux-foundation.org
Mon Mar 14 21:21:27 PDT 2022


On Tue, 15 Mar 2022 11:05:15 +0800 Andrew Yang <andrew.yang at mediatek.com> wrote:

> When memory is tight, the system may start compacting memory to satisfy
> large contiguous allocations. If a process tries to lock a page that is
> already locked and isolated for compaction, it may wait a long time or
> even forever. This is because compaction clears PG_Isolated non-atomically
> while holding the page lock. That non-atomic clear can overwrite the
> PG_waiters bit set by a process that failed to obtain the page lock and
> added itself to the wait queue to wait for the page to be unlocked.
> 
> CPU1                            CPU2
> lock_page(page); (successful)
>                                 lock_page(); (failed)
> __ClearPageIsolated(page);      SetPageWaiters(page) (may be overwritten)
> unlock_page(page);
> 
> The solution is to avoid non-atomic operations on page flags while
> holding the page lock.

Sure, the non-atomic bitop optimization is really risky and I suspect
we reach for it too often, or at least without clearly demonstrating
that it is safe and documenting our assumptions.
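
To make the failure mode concrete, here is a minimal userspace sketch
(not kernel code) that replays the interleaving from the changelog by
hand. The bit positions and the replay_* helper names are made up for
illustration, a C11 atomic word stands in for page->flags, and the
separate load and store model what a non-atomic __ClearPageIsolated()
amounts to.

```c
#include <stdatomic.h>

/*
 * Hypothetical bit positions, chosen for illustration only. They do not
 * match the kernel's real page flag layout.
 */
#define PG_LOCKED   (1UL << 0)
#define PG_WAITERS  (1UL << 1)
#define PG_ISOLATED (1UL << 2)

/*
 * Replay the racy interleaving in a single thread, so the outcome is
 * deterministic. Returns the final value of the flags word.
 */
static unsigned long replay_nonatomic_clear(void)
{
	/* CPU1 already holds the page lock and has isolated the page. */
	atomic_ulong flags = PG_LOCKED | PG_ISOLATED;

	/* CPU1: first half of the non-atomic clear, read the whole word. */
	unsigned long snapshot = atomic_load(&flags);

	/* CPU2: failed to take the lock, atomically marks itself waiting. */
	atomic_fetch_or(&flags, PG_WAITERS);

	/* CPU1: second half, write back the stale word minus PG_ISOLATED. */
	atomic_store(&flags, snapshot & ~PG_ISOLATED);

	return atomic_load(&flags);
}

/*
 * Same two updates, but the clear is a single atomic read-modify-write,
 * so it cannot overwrite a concurrently set bit.
 */
static unsigned long replay_atomic_clear(void)
{
	atomic_ulong flags = PG_LOCKED | PG_ISOLATED;

	/* CPU2: atomically set the waiters bit. */
	atomic_fetch_or(&flags, PG_WAITERS);

	/* CPU1: atomically clear only PG_ISOLATED. */
	atomic_fetch_and(&flags, ~PG_ISOLATED);

	return atomic_load(&flags);
}
```

In the first replay the waiters bit is lost and only PG_LOCKED remains.
As far as I understand it, unlock_page() only wakes sleepers when
PG_waiters is set, which is why the waiter can be left stuck. In the
second replay both PG_LOCKED and PG_WAITERS survive.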

I'm thinking this one should be backported, so I'll queue it for
5.18-rc1, with a cc:stable so it gets trickled back.


