[RFC PATCH 1/2] mtd: nand: schedule() after releasing the device
Sebastian Andrzej Siewior
bigeasy at linutronix.de
Mon Nov 23 13:09:06 EST 2015
I am seeing a livelock in UBI, which keeps doing
ensure_wear_leveling() -> wear_leveling_worker() -> ubi_eba_copy_leb()
MOVE_RETRY -> schedule_erase() -> ensure_wear_leveling()
on the same PEB over and over again. The reason for MOVE_RETRY is that
the LEB-lock owner is stuck in nand_get_device() and does not get the
device lock. The PEB-lock owner is only scheduled on the CPU while the UBI
thread is idle during an erase or read, at which point the UBI thread
(again) owns the device lock, so the LEB-lock owner makes no progress.
To fix this livelock, the patch adds a schedule() invocation after the
device is released if the wait queue for the NAND device lock is not
empty, so that the waiter can grab the lock and make progress.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy at linutronix.de>
---
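Not part of the patch: a minimal userspace sketch of the same idea, for
readers who want to try the pattern outside the kernel. It assumes POSIX
threads, and every name in it (struct dev_lock, dev_get(), dev_release())
is hypothetical. sched_yield() only stands in for the kernel's schedule();
it gives a woken waiter a chance to run but does not guarantee that the
waiter wins the lock.

```c
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for the NAND controller lock and its wait queue. */
struct dev_lock {
	pthread_mutex_t mtx;	/* protects busy and waiters */
	pthread_cond_t wq;	/* waiters sleep here until the device is free */
	bool busy;		/* true while some thread owns the device */
	int waiters;		/* number of threads currently sleeping on wq */
};

static void dev_lock_init(struct dev_lock *d)
{
	pthread_mutex_init(&d->mtx, NULL);
	pthread_cond_init(&d->wq, NULL);
	d->busy = false;
	d->waiters = 0;
}

/* Sleep until the device is free, then take ownership. */
static void dev_get(struct dev_lock *d)
{
	pthread_mutex_lock(&d->mtx);
	while (d->busy) {
		d->waiters++;
		pthread_cond_wait(&d->wq, &d->mtx);
		d->waiters--;
	}
	d->busy = true;
	pthread_mutex_unlock(&d->mtx);
}

/*
 * Release the device. If somebody was waiting, yield the CPU so the
 * waiter can run before this thread calls dev_get() again.
 * Returns true if a yield was performed.
 */
static bool dev_release(struct dev_lock *d)
{
	bool do_sched;

	pthread_mutex_lock(&d->mtx);
	d->busy = false;
	do_sched = d->waiters > 0;
	pthread_cond_signal(&d->wq);
	pthread_mutex_unlock(&d->mtx);

	if (do_sched)
		sched_yield();
	return do_sched;
}
```

A thread that loops over dev_get()/dev_release() then gives up the CPU
only when another thread is actually queued, so the uncontended path
stays unchanged.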
drivers/mtd/nand/nand_base.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index ece544efccc3..3dc2dff01802 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -133,13 +133,23 @@ static int check_offs_len(struct mtd_info *mtd,
static void nand_release_device(struct mtd_info *mtd)
{
struct nand_chip *chip = mtd->priv;
+ bool do_sched = false;
/* Release the controller and the chip */
spin_lock(&chip->controller->lock);
chip->controller->active = NULL;
chip->state = FL_READY;
+ /*
+ * Check if we have a waiter. If so we will schedule() right away so the
+ * waiter can grab the device while it is released and not after _this_
+ * caller gained the device (again) without leaving the CPU in between.
+ */
+ if (waitqueue_active(&chip->controller->wq))
+ do_sched = true;
wake_up(&chip->controller->wq);
spin_unlock(&chip->controller->lock);
+ if (do_sched)
+ schedule();
}
/**
--
2.6.2