Recursion in CFI driver. Is it really necessary to have.

Nicolas Pitre nico at cam.org
Fri Oct 12 11:14:13 EDT 2007


On Fri, 12 Oct 2007, Alexey Korolev wrote:

> Nicolas,
> 
> > 
> > Please be more clear about your problem first.
> > 
> The problem I've found is a kernel deadlock during simultaneous JFFS2
> operations. Debugging also showed that the issue is caused by tons of
> recursion in the get_chip call. This is the stack dump I could catch
> before the deadlock.
> 
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c0124c24>] (get_chip+0x1e0/0x808)
> [<c0124a44>] (get_chip+0x0/0x808) from [<c01276d0>] (do_erase_oneblock+0x3c/0x700)
[...]

Wow.  That's certainly bad.

It's been a while since I wrote that code, but looking at it now I just
can't see how that could happen.  It looks as if it were recursing on
itself, but the code explicitly guards against that.  See:

		if (contender && contender != chip) {
			[...]
			ret = get_chip(map, contender, contender->start, mode);

So from that point, 'chip' and 'contender' must be equal when get_chip() 
is called again, and therefore recursion should stop there.
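
To illustrate the argument, here is a minimal user-space sketch of that
guard.  It is not the driver code: toy_shared, toy_chip, toy_get_chip()
and demo_max_depth() are made-up names, and it assumes the contender is
read from state shared by both chips that does not change while
get_chip() runs.  Under that assumption the nested call sees
contender == chip and goes no deeper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct toy_chip;

struct toy_shared {
	struct toy_chip *writing;	/* chip owning the shared engine, or NULL */
};

struct toy_chip {
	struct toy_shared *shared;
};

/* Deepest nesting level reached by toy_get_chip(), for the demo only. */
static int max_depth;

static int toy_get_chip(struct toy_chip *chip, int depth)
{
	/* Assumption: the contender comes from the shared state. */
	struct toy_chip *contender = chip->shared->writing;

	if (depth > max_depth)
		max_depth = depth;

	if (contender && contender != chip) {
		/*
		 * Nested call with chip == contender.  As long as
		 * shared->writing is unchanged, the test above is false
		 * in the nested call and the recursion ends there.
		 */
		int ret = toy_get_chip(contender, depth + 1);
		if (ret)
			return ret;
	}
	return 0;
}

/* Chip A asks for access while chip B owns the shared engine. */
static int demo_max_depth(void)
{
	struct toy_shared shared = { NULL };
	struct toy_chip a = { &shared };
	struct toy_chip b = { &shared };
	int ret;

	shared.writing = &b;
	max_depth = 0;
	ret = toy_get_chip(&a, 0);
	assert(ret == 0);
	return max_depth;
}
```

In this model demo_max_depth() returns 1: one nested call for the
contender and nothing beyond it.  A trace with dozens of nested
get_chip() frames therefore suggests that one of the sketch's
assumptions does not hold in the failing case.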


Nicolas


