Deadlock in cfi_cmdset_0001.c on simultaneous write operations.
Nicolas Pitre
nico at cam.org
Wed Nov 23 12:13:00 EST 2005
On Wed, 23 Nov 2005, Korolev, Alexey wrote:
> Hi All,
>
> I faced a halting issue on a multi-partitioned chip when I tried to
> execute simultaneous write operations.
What platform are you using? SMP or not? Using CONFIG_PREEMPT or not?
> The platform halted on execution of this sequence:
> dd if=random of=/dev/mtd4 bs=4k count=1k&
> dd if=random of=/dev/mtd5 bs=4k count=1k&
> dd if=random of=/dev/mtd6 bs=4k count=1k
>
> The halt doesn't happen with two simultaneous write operations.
> Execution of
> dd if=random of=/dev/mtd5 bs=4k count=1k&
> dd if=random of=/dev/mtd6 bs=4k count=1k
> was ok.
How big are your MTD partitions? Are they located on separate
_hardware_ partitions or do some of them share one of them?
> I made a small investigation. The platform falls into a deadlock in the
> get_chip function.
> I was unable to definitely locate the place of the halt, but I guess it
> happened here.
Could you add some printk()'s to determine if the code is looping ...
> struct flchip_shared *shared = chip->priv;
> struct flchip *contender;
> spin_lock(&shared->lock);
> contender = shared->writing;
> if (contender && contender != chip) {
>     int ret = spin_trylock(contender->mutex);
>     spin_unlock(&shared->lock);
>     if (!ret)
>         goto retry;
          ^^^^^^^^^^^ here?
>     spin_unlock(chip->mutex);
>     ret = get_chip(map, contender, contender->start, mode);
>     spin_lock(chip->mutex);
>     if (ret) {
>         spin_unlock(contender->mutex);
>         return ret;
>     }
>     timeo = jiffies + HZ;
>     spin_lock(&shared->lock);
> }
> shared->writing = chip;
> if (mode == FL_ERASING)
>     shared->erasing = chip;
> if (contender && contender != chip)
>     spin_unlock(contender->mutex);
> spin_unlock(&shared->lock);
>
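For instance, something along these lines would show whether the
"goto retry" path is spinning. This is only a user-space sketch with
made-up names (struct retry_probe, retry_probe_hit); in the driver the
fprintf() would be a printk(), and the counter would live wherever is
convenient for debugging.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical debugging helper, not part of the MTD code.
 * Counts passes through a retry path and reports on the first pass
 * and then on every `interval`-th pass, so a tight loop does not
 * flood the log. `interval` must be non-zero.
 */
struct retry_probe {
    unsigned long count;
    unsigned long interval;
};

/* Returns true when a message was emitted for this pass. */
static bool retry_probe_hit(struct retry_probe *p, const char *where)
{
    p->count++;
    if (p->count == 1 || p->count % p->interval == 0) {
        fprintf(stderr, "%s: retried %lu times\n", where, p->count);
        return true;
    }
    return false;
}
```

Calling retry_probe_hit(&probe, "get_chip") just before the
"goto retry" would make a livelock visible as an ever-growing count,
whereas a true deadlock on a spinlock would print nothing further.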
> I slightly simplified the code and it helped. The following code
> doesn't halt:
But that code is broken.
> struct flchip_shared *shared = chip->priv;
> struct flchip *contender;
>
> contender = shared->writing;
> if (contender && contender != chip) {
>     yield();
>     timeo = jiffies + HZ;
>     goto retry;
> }
> /* We now own it */
  ^^^^^^^^^^^^^^^^^^^ wrong!
Nothing prevents another thread coming along and assigning itself the
ability to write at this point, just _after_ the current thread
determined that no contender was there but _before_ the shared lock is
acquired below.
> spin_lock(&shared->lock);
> shared->writing = chip;
And then the other thread sees its ownership overwritten.
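
To illustrate the point only: the test of the owner field and the
assignment have to happen under one hold of the shared lock. Below is
a rough user-space sketch of that rule, not the driver code. All names
(struct shared_state, try_claim_writing, release_writing) are made up
for the example, and a C11 atomic_flag stands in for the kernel
spinlock. It does not deal with suspending a contender the way
get_chip() must.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct flchip / struct flchip_shared. */
struct chip {
    int id;
};

struct shared_state {
    atomic_flag lock;       /* plays the role of shared->lock */
    struct chip *writing;   /* current owner of the write engine, or NULL */
};

static void shared_lock(struct shared_state *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void shared_unlock(struct shared_state *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/*
 * Returns true if `chip` owns the write engine on return.
 * The check and the assignment sit inside the same critical section,
 * so no other thread can claim ownership in between.
 */
static bool try_claim_writing(struct shared_state *s, struct chip *chip)
{
    bool owned = false;

    shared_lock(s);
    if (s->writing == NULL || s->writing == chip) {
        s->writing = chip;
        owned = true;
    }
    shared_unlock(s);
    return owned;
}

/* Gives up ownership; the caller must be the current owner. */
static void release_writing(struct shared_state *s, struct chip *chip)
{
    shared_lock(s);
    assert(s->writing == chip);
    s->writing = NULL;
    shared_unlock(s);
}
```

With this shape a second caller gets false back and has to retry or
sleep, instead of silently overwriting the first owner.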
Nicolas