[RFC PATCH] mtd: rawnand: use mutex to protect access while in suspend

Boris Brezillon boris.brezillon at collabora.com
Mon Oct 4 01:41:47 PDT 2021


On Mon,  4 Oct 2021 08:56:09 +0200
Sean Nyekjaer <sean at geanix.com> wrote:

> This will prevent nand_get_device() from returning -EBUSY.
> It will force mtd_write()/mtd_read() to wait for the nand_resume() to unlock
> access to the mtd device.
> 
> Then we avoid -EBUSY is returned to ubifsi via mtd_write()/mtd_read(),
> that will in turn hard error on every error returened.
> We have seen during ubifs tries to call mtd_write before the mtd device
> is resumed.

I think the problem is here. Why would UBIFS/UBI try to write something
to a device that's not resumed yet (or has been suspended already, if
you hit this in the suspend path).

> 
> Exec_op[0] speed things up, so we see this race when the device is
> resuming. But it's actually "mtd: rawnand: Simplify the locking" that
> allows it to return -EBUSY, before that commit it would have waited for
> the mtd device to resume.

Uh, wait. If nand_resume() was called before any writes/reads this
wouldn't happen. IMHO, the problem is not that we return -EBUSY without
blocking, the problem is that someone issues a write/read before calling
mtd_resume().

> 
> Tested on a iMX6ULL.
> 
> [0]:
> ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op")
> 
> Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking")
> Signed-off-by: Sean Nyekjaer <sean at geanix.com>
> ---
> 
> I did this a RFC as we probably will need to remove the suspended
> variable as it's kinda made obsolute by this change.
> Should we introduce a new mutex? Or maybe a spin_lock?
> 
>  drivers/mtd/nand/raw/nand_base.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> index 3d6c6e880520..0ea343404cac 100644
> --- a/drivers/mtd/nand/raw/nand_base.c
> +++ b/drivers/mtd/nand/raw/nand_base.c
> @@ -4567,7 +4567,6 @@ static int nand_suspend(struct mtd_info *mtd)
>  		ret = chip->ops.suspend(chip);
>  	if (!ret)
>  		chip->suspended = 1;
> -	mutex_unlock(&chip->lock);

Hm, I'm not sure keeping the lock when you're in a suspended state
is a good idea. It just papers over another bug IMO (see above).

>  
>  	return ret;
>  }
> @@ -4580,7 +4579,6 @@ static void nand_resume(struct mtd_info *mtd)
>  {
>  	struct nand_chip *chip = mtd_to_nand(mtd);
>  
> -	mutex_lock(&chip->lock);
>  	if (chip->suspended) {
>  		if (chip->ops.resume)
>  			chip->ops.resume(chip);




More information about the linux-mtd mailing list