[RFC PATCH] mtd: rawnand: use mutex to protect access while in suspend
Boris Brezillon
boris.brezillon at collabora.com
Mon Oct 4 01:41:47 PDT 2021
On Mon, 4 Oct 2021 08:56:09 +0200
Sean Nyekjaer <sean at geanix.com> wrote:
> This will prevent nand_get_device() from returning -EBUSY.
> It will force mtd_write()/mtd_read() to wait for the nand_resume() to unlock
> access to the mtd device.
>
> Then we avoid -EBUSY is returned to ubifsi via mtd_write()/mtd_read(),
> that will in turn hard error on every error returened.
> We have seen during ubifs tries to call mtd_write before the mtd device
> is resumed.
I think the problem is here. Why would UBIFS/UBI try to write something
to a device that's not resumed yet (or has been suspended already, if
you hit this in the suspend path).
>
> Exec_op[0] speed things up, so we see this race when the device is
> resuming. But it's actually "mtd: rawnand: Simplify the locking" that
> allows it to return -EBUSY, before that commit it would have waited for
> the mtd device to resume.
Uh, wait. If nand_resume() was called before any writes/reads this
wouldn't happen. IMHO, the problem is not that we return -EBUSY without
blocking, the problem is that someone issues a write/read before calling
mtd_resume().
>
> Tested on a iMX6ULL.
>
> [0]:
> ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op")
>
> Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking")
> Signed-off-by: Sean Nyekjaer <sean at geanix.com>
> ---
>
> I did this a RFC as we probably will need to remove the suspended
> variable as it's kinda made obsolute by this change.
> Should we introduce a new mutex? Or maybe a spin_lock?
>
> drivers/mtd/nand/raw/nand_base.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> index 3d6c6e880520..0ea343404cac 100644
> --- a/drivers/mtd/nand/raw/nand_base.c
> +++ b/drivers/mtd/nand/raw/nand_base.c
> @@ -4567,7 +4567,6 @@ static int nand_suspend(struct mtd_info *mtd)
> ret = chip->ops.suspend(chip);
> if (!ret)
> chip->suspended = 1;
> - mutex_unlock(&chip->lock);
Hm, I'm not sure keeping the lock when you're in a suspended state
is a good idea. It just papers over another bug IMO (see above).
>
> return ret;
> }
> @@ -4580,7 +4579,6 @@ static void nand_resume(struct mtd_info *mtd)
> {
> struct nand_chip *chip = mtd_to_nand(mtd);
>
> - mutex_lock(&chip->lock);
> if (chip->suspended) {
> if (chip->ops.resume)
> chip->ops.resume(chip);
More information about the linux-mtd
mailing list