UBIFS Corrupt during power failure

Fri Apr 10 10:27:04 EDT 2009

> On Mon, 2009-03-30 at 13:00 -0600, Eric Holmberg wrote:
> > Here is a basic summary of my findings to date for debugging 
> > corruption of the root UBIFS volume which is located on NOR flash.  
> > Please comment if you have any suggestions.
> > 
> 
> Hi, any news? Have you tried to enable UBI/UBIFS extra checks?

Hi Artem,

I have enabled the extra checks and the failure messages, but they did
not provide any additional information.  Since we have custom hardware,
I wrote some software that runs in U-Boot and writes test patterns to
the flash to verify that we do not have an underlying problem with the
NOR flash.  The software writes a test pattern to each physical erase
block (PEB) and then randomly erases and rewrites a sector with the test
pattern AA55AA##, where ## is the block number.  A script then performs
a hardware reset of the processor and flash using JTAG.  All physical
sectors are verified after the erase and again after the write.

I did not see any cases where multiple PEBs were corrupted, but the
block being written or erased contained some unexpected patterns when
the flash was reset in the middle of an operation.

Test setup:
 * Using U-Boot 1.3.0
 * Write buffering enabled
 * S29GL256F 256Mbit NOR flash w/ 32-word write buffer
 * Test software that performs read/erase/write operations
 * JTAG debugger that randomly resets the board

Reset during write (unexpected test pattern written after un-programmed
values):

30352240  aa55aa0a aa55aa0a aa55aa0a aa55aa0a
30352250  aa55aa0a aa55aa0a aa55aa0a aa55aa0a
30352260  aa55aa0a aa55aa0a aa55aa0a aa55aa0a
30352270  aa55aa0a aa55aa0a aa55aa0a aa55aa0a
30352280  ffffffff ffffffff ffffffff ffffffff
30352290  ffffffff ffffffff ffffffff ffffffff
303522a0  ffffffff ffffffff ffffffff ffffffff
303522b0  aa55aa0a aa55aa0a aa55aa0a aa55aa0a
303522c0  ffffffff ffffffff ffffffff ffffffff
303522d0  ffffffff ffffffff ffffffff ffffffff
303522e0  ffffffff ffffffff ffffffff ffffffff

Reset during erase (unexpected - 1s changed to 0s during erase):

30249930  aa55aa02 aa55aa02 aa55aa02 aa55aa02
30249940  aa55aa02 aa55aa02 aa55aa02 aa55aa02
30249950  8a51aa02 aa55aa02 a855aa02 aa55aa02
30249960  00000000 00000000 00000000 00000000
30249970  00000000 00000000 00000000 00000000
30249980  00000000 00000000 00000000 00000000
30249990  00000000 00000000 00000000 00000000
302499a0  00000000 00000000 00000000 80000000
302499b0  02000001 00000000 00000000 00000000
302499c0  00040000 00000000 00000000 80000000

Reset during erase (expected erase behavior - 0s not yet changed to 1s):

30248ed0  ffffffff ffffffff ffffffff ffffffff
30248ee0  ffffffef ffffffff ffffffff ffffffff
30248ef0  ffffffff ffffffff ffffffff ffffffff
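
One way to sort the words in dumps like these is to compare each word
against the block's test pattern bit by bit.  The sketch below is a
hypothetical helper, not part of the test software: it labels a word as
the pattern, as erased, as "between" (every 1 bit of the pattern is
still set, so the word lies somewhere between erased and programmed), or
as "overcleared" (a bit that is 1 in the pattern reads 0).  It assumes
the last dump belongs to block 02, which the addresses suggest but do
not prove.

```python
ERASED = 0xFFFFFFFF


def classify(word, block):
    """Label one 32-bit dump word relative to the AA55AA## pattern."""
    pattern = 0xAA55AA00 | (block & 0xFF)
    if word == pattern:
        return "pattern"
    if word == ERASED:
        return "erased"
    if word & pattern == pattern:
        return "between"      # no bit of the pattern has been lost
    return "overcleared"      # some bit that is 1 in the pattern reads 0


def classify_dump(dump, block):
    """Count word labels in 'address word word ...' dump lines."""
    counts = {}
    for line in dump.strip().splitlines():
        for text in line.split()[1:]:   # first field is the address
            kind = classify(int(text, 16), block)
            counts[kind] = counts.get(kind, 0) + 1
    return counts


print(classify_dump("30249950  8a51aa02 aa55aa02 a855aa02 aa55aa02", 2))
# {'overcleared': 2, 'pattern': 2}
print(classify_dump("30248ee0  ffffffef ffffffff ffffffff ffffffff", 2))
# {'between': 1, 'erased': 3}
```

With this split, the words in the second dump (8a51aa02, a855aa02, and
the all-zero words) fall into the "overcleared" group, while ffffffef in
the third dump is "between".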

Questions
---------
How are interrupted writes or erase cycles handled in UBI / UBIFS for
NOR flash?  Are the unexpected PEB values that I am seeing properly
handled by the UBI/UBIFS error recovery process?  Are erase and write
operations journaled to allow restarting the process upon boot-up?

As a side note, MTD_BIT_WRITEABLE is not set for the NOR flash.  Is this
to be expected?  Do I need to set this in the partition table?  The NOR
flash does support programming a 1 to a 0, which is what I'm assuming
MTD_BIT_WRITEABLE means.

root@device:/# mtd_debug info /dev/mtd3
mtd.type = MTD_NORFLASH
mtd.flags = MTD_CAP_NORFLASH
mtd.size = 29360128 (28M)
mtd.erasesize = 131072 (128K)
mtd.writesize = 1
mtd.oobsize = 0
regions = 0
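
For reference on the flag question, here is a small sketch of how the
capability flags combine.  The numeric values are an assumption taken
from my reading of the kernel's include/mtd/mtd-abi.h and should be
checked against the header of the kernel actually in use; has_flag() and
program() are hypothetical helpers for illustration only.

```python
# Assumed values from include/mtd/mtd-abi.h; verify against your kernel.
MTD_WRITEABLE = 0x400       # device is writeable
MTD_BIT_WRITEABLE = 0x800   # individual bits can be cleared
MTD_CAP_NORFLASH = MTD_WRITEABLE | MTD_BIT_WRITEABLE
MTD_CAP_NANDFLASH = MTD_WRITEABLE


def has_flag(flags, flag):
    """Return True if every bit of `flag` is set in `flags`."""
    return flags & flag == flag


def program(old, data):
    """Model a NOR program operation: bits can only go from 1 to 0."""
    return old & data


print(has_flag(MTD_CAP_NORFLASH, MTD_BIT_WRITEABLE))   # True
print(has_flag(MTD_CAP_NANDFLASH, MTD_BIT_WRITEABLE))  # False
print(hex(program(0xFFFFFFFF, 0xAA55AA0A)))            # 0xaa55aa0a
```

If these definitions apply, a device reporting mtd.flags =
MTD_CAP_NORFLASH would already include MTD_BIT_WRITEABLE.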

Thanks for your help and time!

Regards,

Eric Holmberg