[PATCH] cfi: Fixup of write errors on XIP
Nicolas Pitre
nico at cam.org
Thu Mar 2 10:36:46 EST 2006
On Wed, 1 Mar 2006, Korolev, Alexey wrote:
> Hi all,
>
> Several days ago I sent this patch to the list. I wonder are there any
> objections to it?
I do have some.
> I guess this patch shouldn't break anything. It just fixes write errors
> I saw in our tests. The patch has been verified on several
> configurations.
It is not a proper fix. Something else doesn't work as expected.
> If you don't mind I will commit it by the end of this week.
Please don't just yet.
> The scenario of the issue is following:
>
> 1. do_write_buffer
> 2. Waiting for write complete in xip_udelay
> 3. System Interrupt
> 4. Write suspend
> 5. Rescheduling
> 6. Block erasing by other process. ( This operation typically took
> rather long time )
> 7. Complete, rescheduling
> 8. Return to write (write is not complete due to suspend ).
> 9. Check timeout. Time is up.
> 10. Error.
This should not happen. And if it does then the bug is in xip_udelay()
and therefore should be fixed there.
The fact is, xip_udelay() should not return until either the flash
status is 0x80 (done) or the delay expired. The code looks like:
...
start = xip_curtime();
...
do {
...
if (xip_irqpending()...) {
/* suspend code */
...
[here we substract the time waited so far]
usec -= xip_elapsed_since(start);
...
/* schedule code */
...
/* Resume the write or erase operation */
map_write(map, CMD(0xd0), adr);
map_write(map, CMD(0x70), adr);
chip->state = oldstate;
start = xip_currtime();
}
...
status = map_read(map, adr);
} while (!map_word_andequal(map, status, OK, OK)
&& xip_elapsed_since(start) < usec);
So, even if this gets rescheduled a long time after an erase occurred in
another process, the "start" variable will get set to the new time when
the write is resumed and the erase delay is therefore already accounted
for.
What if you replace "usec -= xip_elapsed_since(start)" by
"usec -= xip_elapsed_since(start) / 2" instead? This doesn't have to be
exactly accurate -- in fact this is there just to avoid looping forever
if the flash never return a success status.
Nicolas
More information about the linux-mtd
mailing list