[PATCH] zsmalloc: Fix races between modifications of fullness and isolated

Andrew Yang (楊智強) Andrew.Yang at mediatek.com
Tue Jul 25 23:59:20 PDT 2023


On Wed, 2023-07-26 at 12:18 +0900, Sergey Senozhatsky wrote:
> On (23/07/21 14:37), Andrew Yang wrote:
> > 
> > Since fullness and isolated share the same unsigned int,
> > modifications of them should be protected by the same lock.
> > 
> > Signed-off-by: Andrew Yang <andrew.yang at mediatek.com>
> > Fixes: c4549b871102 ("zsmalloc: remove zspage isolation for migration")
> 
> Have you observed issues in real life? That commit is more than a year
> and a half old, so I wonder.
> 
Yes. We have recently hit many kernel exceptions:
VM_BUG_ON(zspage->isolated == 0) in dec_zspage_isolation() and
BUG_ON(!pages[1]) in zs_unmap_object().
The issue only occurs when migration and reclaim run at the same
time. With our memory stress test, we can reproduce it several times
a day. We do not know why no one else has encountered it. BTW, we
switched to a kernel version containing this defect a few months ago.
> > @@ -1858,8 +1860,8 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
> >   * Since we complete the data copy and set up new zspage structure,
> >   * it's okay to release the pool's lock.
> >   */
> 
> This comment should be moved too, because this is not where we unlock
> the pool anymore.
> 
Okay, I will submit a new patch later.
> > -	spin_unlock(&pool->lock);
> >  	dec_zspage_isolation(zspage);
> > +	spin_unlock(&pool->lock);
> >  	migrate_write_unlock(zspage);
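
To illustrate why the order of these two lines matters, the lost update
can be replayed deterministically in user space. This is only a sketch:
the field widths, masks and helper names below are hypothetical and are
not the kernel's struct zspage layout. It assumes that updating one
field of a shared word is a load/modify/store of the whole word, which
is how bitfield updates typically compile.

```c
#include <assert.h>

/*
 * Hypothetical packing: two small fields share one unsigned int, the
 * way adjacent bitfields do. Widths and names are illustrative only.
 */
#define FULLNESS_MASK	0x3u				/* bits 0-1 */
#define ISOLATED_SHIFT	2
#define ISOLATED_MASK	(0x7u << ISOLATED_SHIFT)	/* bits 2-4 */

static unsigned int get_fullness(unsigned int w)
{
	return w & FULLNESS_MASK;
}

static unsigned int get_isolated(unsigned int w)
{
	return (w & ISOLATED_MASK) >> ISOLATED_SHIFT;
}

/* Read-modify-write of the whole word to change only "fullness". */
static unsigned int set_fullness(unsigned int w, unsigned int v)
{
	return (w & ~FULLNESS_MASK) | (v & FULLNESS_MASK);
}

/* Read-modify-write of the whole word to decrement only "isolated". */
static unsigned int dec_isolated(unsigned int w)
{
	unsigned int v = get_isolated(w) - 1;

	return (w & ~ISOLATED_MASK) | ((v << ISOLATED_SHIFT) & ISOLATED_MASK);
}

/* Racy interleaving: A decrements "isolated" without holding the lock. */
unsigned int replay_unserialized(void)
{
	unsigned int word = 1u << ISOLATED_SHIFT; /* isolated=1, fullness=0 */
	unsigned int stale;

	stale = word;			/* A: loads the word */
	word = set_fullness(word, 2);	/* B: complete update under the lock */
	word = dec_isolated(stale);	/* A: stores a value built from stale */
	return word;			/* B's fullness update is gone */
}

/* Same two updates, serialized by one lock: neither update is lost. */
unsigned int replay_serialized(void)
{
	unsigned int word = 1u << ISOLATED_SHIFT;

	word = set_fullness(word, 2);	/* B */
	word = dec_isolated(word);	/* A */
	return word;
}
```

In the unserialized replay the fullness written by B is overwritten by
A's stale copy. The opposite interleaving would lose an update to
isolated instead. With both updates under the same lock, both survive.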