[PATCH] cfi: Fixup of write errors on XIP

Nicolas Pitre nico at cam.org
Thu Mar 2 10:36:46 EST 2006


On Wed, 1 Mar 2006, Korolev, Alexey wrote:

> Hi all,
> 
> Several days ago I sent this patch to the list. I wonder are there any
> objections to it?

I do have some.

> I guess this patch shouldn't break anything. It just fixes write errors
> I saw in our tests. The patch has been verified on several
> configurations. 

It is not a proper fix.  Something else doesn't work as expected.

> If you don't mind I will commit it by the end of this week.

Please don't just yet.

> The scenario of the issue is following:
>  
> 1. do_write_buffer 
> 2. Waiting for write complete in xip_udelay
> 3. System Interrupt
> 4. Write suspend
> 5. Rescheduling
> 6. Block erasing by other process. ( This operation typically took
> rather long time )
> 7. Complete, rescheduling 
> 8. Return to write (write is not complete due to suspend ). 
> 9. Check timeout. Time is up.
> 10. Error.

This should not happen.  And if it does then the bug is in xip_udelay() 
and therefore should be fixed there.

The fact is, xip_udelay() should not return until either the flash 
status is 0x80 (done) or the delay expired.  The code looks like:

	...
	start = xip_curtime();
	...
	do {
		...
		if (xip_irqpending()...) {
			/* suspend code */
			...
			[here we substract the time waited so far]
			usec -= xip_elapsed_since(start);
			...
			/* schedule code */
			...
			/* Resume the write or erase operation */
			map_write(map, CMD(0xd0), adr);
			map_write(map, CMD(0x70), adr);
			chip->state = oldstate;
			start = xip_currtime();
		}
		...
		status = map_read(map, adr);
	} while (!map_word_andequal(map, status, OK, OK)
		 && xip_elapsed_since(start) < usec);

So, even if this gets rescheduled a long time after an erase occurred in 
another process, the "start" variable will get set to the new time when 
the write is resumed and the erase delay is therefore already accounted 
for.

What if you replace "usec -= xip_elapsed_since(start)" by
"usec -= xip_elapsed_since(start) / 2" instead?  This doesn't have to be 
exactly accurate -- in fact this is there just to avoid looping forever 
if the flash never return a success status.



Nicolas




More information about the linux-mtd mailing list