Deadlock in cfi_cmdset_0001.c on simultaneous write operations.

Nicolas Pitre nico at cam.org
Wed Nov 23 17:02:40 EST 2005


On Wed, 23 Nov 2005, Korolev, Alexey wrote:

> Hi All,
>  
> I faced a halting issue on multi partitioned chip when I tried to
> execute simultaneous write operations.
> Platform has halted, on execution of this sequence:
> dd if=random of=/dev/mtd4 bs=4k count=1k&
> dd if=random of=/dev/mtd5 bs=4k count=1k&
> dd if=random of=/dev/mtd6 bs=4k count=1k
> 
> Halt didn't happens on two simultaneous write operations
> Execution of 
> dd if=random of=/dev/mtd5 bs=4k count=1k&
> dd if=random of=/dev/mtd6 bs=4k count=1k
> was ok.

I tested the above and cannot reproduce any hang whatsoever, even with 
additional parallel writes.

> I made small investigation. Platform falls to deadlock in get_chip
> function.
> I was unable to definetly locate the place of the halt. But I gues it
> happened here.
>  
>   struct flchip_shared *shared = chip->priv;
>   struct flchip *contender;
>   spin_lock(&shared->lock);
>   contender = shared->writing;
>   if (contender && contender != chip) {
>    int ret = spin_trylock(contender->mutex);
>    spin_unlock(&shared->lock);
>    if (!ret)
>     goto retry;
>    spin_unlock(chip->mutex);
>    ret = get_chip(map, contender, contender->start, mode);
>    spin_lock(chip->mutex);
>    if (ret) {
>     spin_unlock(contender->mutex);
>     return ret;
>    }
>    timeo = jiffies + HZ;
>    spin_lock(&shared->lock);
>   }
>   shared->writing = chip;
>   if (mode == FL_ERASING)
>    shared->erasing = chip;
>   if (contender && contender != chip)
>    spin_unlock(contender->mutex);
>   spin_unlock(&shared->lock);

Actually, the above is a bit too strict. I don't know if the following 
patch will fix your problem, but it certainly makes it look more correct 
from a theoretical point of view.

diff --git a/drivers/mtd/chips/cfi_cmdset_0001.c b/drivers/mtd/chips/cfi_cmdset_0001.c
index 143f01a..a4b07e2 100644
--- a/drivers/mtd/chips/cfi_cmdset_0001.c
+++ b/drivers/mtd/chips/cfi_cmdset_0001.c
@@ -644,9 +644,8 @@ static int get_chip(struct map_info *map
 		 *
 		 * - contension arbitration is handled in the owner's context.
 		 *
-		 * The 'shared' struct can be read when its lock is taken.
-		 * However any writes to it can only be made when the current
-		 * owner's lock is also held.
+		 * The 'shared' struct can be read and/or written only when
+		 * its lock is taken.
 		 */
 		struct flchip_shared *shared = chip->priv;
 		struct flchip *contender;
@@ -675,14 +674,13 @@ static int get_chip(struct map_info *map
 			}
 			timeo = jiffies + HZ;
 			spin_lock(&shared->lock);
+			spin_unlock(contender->mutex);
 		}
 
 		/* We now own it */
 		shared->writing = chip;
 		if (mode == FL_ERASING)
 			shared->erasing = chip;
-		if (contender && contender != chip)
-			spin_unlock(contender->mutex);
 		spin_unlock(&shared->lock);
 	}
 

Nicolas




More information about the linux-mtd mailing list