Race-free NAND device removal

Boris Brezillon boris.brezillon at free-electrons.com
Mon Jul 4 03:06:30 PDT 2016


On Mon, 4 Jul 2016 11:44:03 +0200
Richard Weinberger <richard at nod.at> wrote:

> Am 04.07.2016 um 11:16 schrieb Boris Brezillon:
> > On Sun, 3 Jul 2016 15:38:42 +0200
> > Richard Weinberger <richard at nod.at> wrote:
> >   
> >> Hi!
> >>
> >> While working on nandsim I realized that nand_release() ignores the return
> >> value from mtd_device_unregister().
> >>
> >> That means NAND devices cannot be removed in a race-free manner.
> >> Consider a NAND driver that registers ->_get_device() and ->_put_device()
> >> callbacks for refcounting. In its removal function it will return -EBUSY
> >> whenever the refcount is > 0.
> >> But when the device is claimed while it is being removed, the refcount
> >> can increment after the check.
> >> MTD can deal with that and mtd_device_unregister() will return -EBUSY.
> >> But nand_release() won't notice and the NAND driver continues with the tear down
> >> process.  
> > 
> > Yes, I already noticed that, and apparently all NAND controller drivers
> > seem to assume that nand_release() always succeeds. It's definitely a
> > bug, since the MTD device will still be exposed, but the underlying
> > NAND structure (and the associated data + implementation) will be
> > gone :-/.  
> 
> Well, in most cases it will work since the module refcounting kicks in.
> And no NAND driver creates/removes MTDs at runtime.

Yep.

> 
> >>
> >> Would a change like the following one be acceptable, or is a NAND driver
> >> allowed to call mtd_device_unregister() itself?
> >> AFAICT the additional call to mtd_device_unregister() in nand_release() would
> >> be a no-op then.  
> > 
> > This patch looks good, but NAND controller drivers will keep ignoring
> > the nand_release() return code and release their own private data, so
> > implementations are still buggy ;).
> > 
> > This whole NAND dev registration/deregistration is unsafe, and I plan
> > to rework it when moving to a controller <-> chips infrastructure.
> > 
> > Are you fixing a real bug or just a potential one? Cause I'm not sure
> > doing that is any safer if we don't patch all the NAND controller
> > drivers...  
> 
> I'm facing a real issue on nandsim.
> Currently I'm heavily reworking nandsim.
> One of the new features is that you can add/remove NAND MTDs during runtime
> using a userspace tool. It works like losetup.
> 
> $ nandsimctl --backend file /home/rw/work/XXX/broken_mtd.raw --id-bytes 0x....
> 
> While making this race-free I found that issue.

Okay, so you modified the nandsim code to check the nand_release() return
code, right? Maybe you can send this change as part of your nandsim rework
series then.
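
To make the idea concrete, here is a rough userspace sketch of a removal
path that honours the return code instead of ignoring it. Everything in it
is hypothetical and only models the behaviour discussed above: the fake_*
structures and functions are stand-ins, not the kernel API, and it assumes
a nand_release() variant that returns an int (the change proposed in this
thread).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an MTD device with a usage refcount. */
struct fake_mtd {
	int usecount;		/* bumped by users that claim the device */
	bool registered;	/* still exposed to users? */
};

/* Hypothetical controller driver state wrapping the MTD device. */
struct fake_ctrl {
	struct fake_mtd mtd;
	bool priv_freed;	/* models release of driver private data */
};

/* Models mtd_device_unregister(): refuses while the device is in use. */
static int fake_mtd_unregister(struct fake_mtd *mtd)
{
	if (mtd->usecount > 0)
		return -EBUSY;

	mtd->registered = false;
	return 0;
}

/* Assumed nand_release() variant that propagates the error to the caller. */
static int fake_nand_release(struct fake_mtd *mtd)
{
	return fake_mtd_unregister(mtd);
}

/* Removal path: only tear down private data once unregistration worked. */
static int fake_ctrl_remove(struct fake_ctrl *ctrl)
{
	int ret = fake_nand_release(&ctrl->mtd);

	if (ret)
		return ret;	/* device still claimed, keep everything alive */

	ctrl->priv_freed = true;
	return 0;
}
```

The point is just the ordering: private data is released only after the
unregister step reports success, so a device that got claimed in the
meantime stays backed by valid structures.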



