Numonyx NOR and chip->mutex bug?
Stefan Bigler
stefan.bigler at keymile.com
Thu Feb 3 18:18:44 EST 2011
Hi
Now I have a patch that fix my problem. I did not yet do a lot of tests
and I did
not yet analyzed correctness in all cases.
I check in the inval_cache_and_wait_for_operation() of the status is
status_OK
the if the chip->state and the chip_state (the current chip state and
the chip state
when the function is called) are equal or not. If equal break and exit
(set the
chip->state= FL_STATUS) and if not equal add the thread in the wait queue.
I think we are not finished when not equal so we have do not have to
take the mutex.
Now why this code worked with the older Flash chips. I don't know, but
eventually
they are slower after the write buffer command 0xe8 and they are still
busy when
inval_cache_and_wait_for_operation() checks if status_OK.
Tomorrow I'll check if this is the case. Now I do not have access to the HW.
Regards Stefan
diff --git a/drivers/mtd/chips/cfi_cmdset_0001.c
b/drivers/mtd/chips/cfi_cmdset_0001.c
index 28fc5c2..723985c 100644
--- a/drivers/mtd/chips/cfi_cmdset_0001.c
+++ b/drivers/mtd/chips/cfi_cmdset_0001.c
@@ static int inval_cache_and_wait_for_operation(
for (;;) {
status = map_read(map, cmd_adr);
- if (map_word_andequal(map, status, status_OK, status_OK))
+ if (map_word_andequal(map, status, status_OK, status_OK)) {
+ if (chip->state != chip_state) {
+ goto unequal_states;
+ }
break;
+ }
if (!timeo) {
@@ static int inval_cache_and_wait_for_operation(
cond_resched();
timeo--;
}
mutex_lock(&chip->mutex);
+unequal_states:
while (chip->state != chip_state) {
Am 02/03/2011 05:38 PM, schrieb Stefan Bigler:
> Hi
>
> I did some more anaysis of the write probem when creating a ubivolume.
> I come to the conclusion that there must be a problem in the fuction
> inval_cache_and_wait_for_operation().
>
> Why I think this. In the log below you see that in the first part the
> do_erase_oneblock() start the erasing. In the function
> inval_cache_and_wait_for_operation() this thread is interrupted.
>
> A do_write_buffer() suspends the erase by chip_ready() and runs until
> cmd 0xe8 and waits for completion in WAIT_TIMEOUT.
>
> Here again the do_erase_oneblock() continues to run and runs a
> erase resume sequence to the end of erase the block.
> --> THIS IS WRONG.
>
> Now the do_write_buffer() continue but it ends in a write error.
>
>
> When reading the inval_cache_and_wait_for_operation() there are 2
> places were the mutex is unlocked
>
> 1) while INVALIDATE_CACHED_RANGE()
> 2) if status is not status_OK
>
> My measurement shows that the mutex is taken in the first place.
>
> Now I do expect that this method should make sure that the erasing
> thread get "suspended" and the writing task can continue.
> But this is not the case.
> There is code were we compare the chip_state and chip->state.
> But we are not hitting this code.
>
> This is the summary of my analysis for today.
> I was asking me why this code is working on other chips?
>
> Regards
> Stefan
>
>
> do_erase_oneblock() begin
> [1507][0209] do_erase_oneblock start adr=0x00000000 len=0x20000
> [1507][0209] map_write 0x50 to 0x00000000
> [1507][0209] map_write 0x20 to 0x00000000
> [1507][0209] map_write 0xd0 to 0x00000000
> [1507][0209] do_erase_oneblock before INVAL_CACHE_AND_WAIT
> adr=0x00000000 len=0x20000
> [1507][0209] inval_cache_and_wait_for_operation short unlock
>
> do_write_buffer() begin
> [1507][0465] map_write 0xb0 to 0x03fe4c00
> [1515][0465] erase suspend 1 adr=0x03fe4c00
> [1515][0465] map_write 0x70 to 0x03fe4c00
> [1515][0465] map_write 0xe8 to 0x03fe4c00
> [1515][0465] buffer write before 0xe8 WAIT_TIMEOUT
> [1515][0465] inval_cache_and_wait_for_operation short unlock
>
> do_erase_oneblock() continue & finish
> [1515][0209] inval_cache_and_wait_for_operation short lock
> [1515][0209] do_erase_oneblock after INVAL_CACHE_AND_WAIT
> adr=0x00000000 len=0x20000
> [1515][0209] map_write 0x70 to 0x00000000
> [1515][0209] map_write 0x50 to 0x00000000
> [1515][0209] map_write 0xd0 to 0x00000000
> [1515][0209] map_write 0x70 to 0x00000000
> [1523][0209] erase resumed 2b adr=0x00000000
> [1527][0209] do_erase_oneblock end adr=0x00000000 len=0x20000
>
> do_write_buffer() continue & error
> [1527][0465] inval_cache_and_wait_for_operation short lock
> [1531][0465] buffer write after 0xe8 WAIT_TIMEOUT
> [1531][0465] buffer write data to buf words=511
> [1531][0465] buffer write data buf done
> [1531][0465] map_write 0xd0 to 0x03fe4c00
> [1531][0465] 50000000.flash: buffer write before INVAL_CACHE_AND_WAIT
> adr=0x03fe5000 len=0x400
> [1531][0465] inval_cache_and_wait_for_operation short unlock
> [1531][0465] inval_cache_and_wait_for_operation short lock
> [1531][0465] inval_cache_and_wait_for_operation long unlock
> [1543][0465] inval_cache_and_wait_for_operation long lock
>
> [2503][0465] inval_cache_and_wait_for_operation long unlock
> [2511][0465] inval_cache_and_wait_for_operation long lock
> [2511][0465] 50000000.flash: buffer write after INVAL_CACHE_AND_WAIT
> adr=0x03fe5000 len=0x400
> [2511][0465] map_write 0x50 to 0x03fe4c00
> [2511][0465] map_write 0x70 to 0x03fe4c00
> [2519][0465] 50000000.flash: buffer write error status 0xb0
> adr=0x03fe5000 len=0x400
>
> Am 02/03/2011 04:24 PM, schrieb Michael Cashwell:
>> On Feb 3, 2011, at 4:50 AM, Joakim Tjernlund wrote:
>>
>>>>> My failures were caused by the erase-resume status read "seeing" a
>>>>> non-busy WSM. Adding a 20µs delay before that read (actually after
>>>>> the command write, but it amounts to the same thing) prevents this
>>>>> from happening. By the time awakened "wait-for-completion" code
>>>>> does its first status read it sees the WSM as busy as it should
>>>>> until the erase actually finishes.
>>> Perhaps you can test for ESS(SR.6) too?
>>> something like:
>>> if DWS&& !ESS
>> Hmmm. So instead of a blind delay it would loop (with a timeout)
>> while DWS&& ESS are both set (meaning the erase resume command just
>> written hasn't yet taken effect). It does make sense that if the
>> resume takes time to drop SR.7 (ready) that it would also take time
>> to drop SR.6 (ESS).
>>
>> I like that. I'll add that to the set of tests today and report back.
>>
>> -Mike
>>
>
More information about the linux-mtd
mailing list