Hang on reboot in nand_get_device()

Andrew E. Mileski andrewm at isoar.ca
Mon Nov 9 10:43:41 PST 2015


On 2015-11-06 13:00, Brian Norris wrote:
>
> On Thu, Jul 02, 2015 at 03:21:48PM -0400, Andrew E. Mileski wrote:
>> I'm experiencing a hang on reboot with a Freescale P1022 PowerPC system, with a
>> dual chip-select NAND part (specified in the device tree as two
>> separate devices), and kernel v4.0.6.
>
> Which driver?

Sorry, I wasn't specific enough:  fsl_elbc_nand.c

Currently using the 4.2.0 Linux kernel version, and the problem persists.

I've actually made a small hack to it to enable software ECC, but it isn't fit 
for distribution.

The stock driver is rather ugly in my opinion, and makes some overly broad 
assumptions about the hardware; the driver really is only intended to support 
hardware ECC.  I've taken a few stabs at cleaning-up the stock driver, but they 
all end-up being complete re-writes (for multi-die support), which is a bit more 
than I can chew.

The driver otherwise is usable as-is, albeit with only hardware ECC (that is 
insufficient for many modern NAND devices), and exhibits the same problem.

>> It appears to be a hang in mtd/nand/nand_base.c:nand_get_device() waiting on
>> chip->controller->active.
>
> And now that I begin describing the problem and grepping through the
> source logs... I see that this problem was already resolved back in
> 2009, except for the FL_PM_SUSPENDED mode:
>
> commit 6b0d9a84124937f048bcb8b21313152b23063978
> Author: Li Yang <leoli at freescale.com>
> Date:   Tue Nov 17 14:45:49 2009 -0800
>
>      mtd: nand: fix multi-chip suspend problem
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6b0d9a84124937f048bcb8b21313152b23063978
>
> I actually don't see why we can't just use
> nand_get_device(FL_PM_SUSPENDED) for the shutdown/reboot case, like
> this:
>
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index cc74142938b0..ece544efccc3 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -3110,7 +3110,7 @@ static void nand_resume(struct mtd_info *mtd)
>    */
>   static void nand_shutdown(struct mtd_info *mtd)
>   {
> -	nand_get_device(mtd, FL_SHUTDOWN);
> +	nand_get_device(mtd, FL_PM_SUSPENDED);
>   }
>
>   /* Set default functions */
>
>
> It's also possible that this could be better solved in a proper
> refactor/rewrite of the NAND subsystem using a better controller/chip
> split, so there's only one reboot handler per NAND controller. Boris has
> been looking at that.
>
> Anyway, if the above looks OK, I can send a proper patch and get your
> 'Tested-by's.
>
> Brian

That seems to work too!  Thanks!

~~Andrew E. Mileski



More information about the linux-mtd mailing list