Flash corruption

Gil Weber gilw at cse-semaphore.com
Wed Oct 5 05:43:55 EDT 2011


Hi all,
We are using jffs2 in an embedded system with kernel 2.6.27, and sometimes
we see a file system corruption that we can't explain (no power fail, ...).
The corruption follows some basic write operations that update our software.

Once the update has failed, if we connect to the device we are unable to
write anything, and we see this kind of error in the logs:

   Write of 145 bytes at 0x0096e490 failed. returned -5, retlen 144
   Write of 145 bytes at 0x0096e524 failed. returned -5, retlen 144
   Node totlen on flash (0x00000000) != totlen from node ref (0x000003e4)
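
The two "Write of ... failed" lines above can be decoded with a short script.
This is only a sketch for reading the numbers: it maps the return code to an
errno name, computes how many bytes were left unwritten, and computes the next
4-byte boundary after the failed write. The function name decode() and the
4-byte alignment are assumptions made for illustration.

```python
import errno
import re

# Matches the MTD write-failure lines quoted in the log above.
LINE_RE = re.compile(
    r"Write of (\d+) bytes at (0x[0-9a-fA-F]+) failed\. "
    r"returned (-?\d+), retlen (\d+)"
)


def decode(line):
    """Describe one 'Write of ... failed' line, or return None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    length = int(m.group(1))
    addr = int(m.group(2), 16)
    ret = int(m.group(3))
    retlen = int(m.group(4))
    return {
        # Kernel return codes are negated errno values.
        "errno_name": errno.errorcode.get(-ret, "unknown"),
        # First flash offset that was not written.
        "first_unwritten": addr + retlen,
        "missing_bytes": length - retlen,
        # Assumption: the next node starts on a 4-byte boundary.
        "next_aligned": (addr + length + 3) & ~3,
    }


log = [
    "Write of 145 bytes at 0x0096e490 failed. returned -5, retlen 144",
    "Write of 145 bytes at 0x0096e524 failed. returned -5, retlen 144",
]

for line in log:
    info = decode(line)
    print(
        "%s: %d byte(s) missing, first unwritten offset 0x%08x, "
        "next aligned offset 0x%08x"
        % (
            info["errno_name"],
            info["missing_bytes"],
            info["first_unwritten"],
            info["next_aligned"],
        )
    )
```

Both writes return -5 (EIO) and stop one byte short. The second write starts
at 0x0096e524, which is the first 4-byte boundary after the end of the first
attempt (0x0096e490 + 145 = 0x0096e521). That is consistent with the same
write being retried right after the failed one and failing on its last byte
again.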

We first suspected a hardware issue, because it only occurs on new devices.
But the strange thing is that we are unable to reproduce it. If we take a
device that already had a corruption and reinitialize its flash, everything
works perfectly, even if we run hundreds of updates continuously for a few
weeks. I also ran different stress tests for jffs2 and the flash, and all
seems OK.

This is why I now suspect a timing problem, and it seems to occur only on
new devices with a "fresh" flash. Is that possible?
Or could it depend on an external factor, like the temperature?

Our flash is a Spansion GL128N90 (128 Mbit, 16 MB) and we are using CFI
command set 0002.
I see a possible weakness in cfi_cmdset_0002.c: the write and erase timeouts
are hardcoded. Could these timeouts be too small?
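
One way to reason about whether a hardcoded timeout could be too small is to
convert it from jiffies to milliseconds for the kernel's HZ setting and
compare the result with the maximum program time in the flash datasheet. The
sketch below does the conversion only. The formula HZ / 1000 + 1 and the HZ
values are assumptions for illustration, not values confirmed from this
system; the real expression has to be read from cfi_cmdset_0002.c in the
kernel tree in use.

```python
def tick_ms(hz):
    """Length of one jiffy in milliseconds for a given HZ."""
    return 1000.0 / hz


def timeout_ms(hz, timeout_jiffies):
    """Nominal duration in milliseconds of a timeout given in jiffies."""
    return timeout_jiffies * tick_ms(hz)


def assumed_write_timeout_jiffies(hz):
    """Hypothetical write timeout: HZ / 1000 + 1, with integer division."""
    return hz // 1000 + 1


# Assumed HZ values; the real one comes from the kernel configuration.
for hz in (100, 250, 1000):
    j = assumed_write_timeout_jiffies(hz)
    print(
        "HZ=%d: %d jiffies, nominal %.1f ms, tick period %.1f ms"
        % (hz, j, timeout_ms(hz, j), tick_ms(hz))
    )
```

Because jiffies only advance once per tick, the real time before such a
timeout expires can differ from the nominal value by up to one tick period,
so a timeout of one or two jiffies is a coarse bound at best.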

What do you think?

Best regards,
Gil Weber



