UBIFS Corrupt during power failure

Fri Apr 17 19:49:52 EDT 2009

> > > 
> > > Then I guess we should just introduce mtd->max_corruption? This would
> > > mean maximum amount of bytes corruption may span in case of power cuts?
> > 
> > With that name maybe whoever implements striping will remember to think
> > about parallelism and limit it :-)
> > 
> >   /* Max size of corrupted block when a write command is interrupted
> >      by reset or power failure. */
> >   u32 max_write_corruption;

I like this suggestion -- good variable name.
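
To make the idea concrete, here is a minimal userspace sketch of how such a
field might be consumed. It is not actual MTD code: struct mtd_sketch and
corruption_end() are hypothetical names, and the sketch assumes that an
interrupted write can only damage the single aligned block of
max_write_corruption bytes that contains the offset being written.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed field; not the real struct mtd_info. */
struct mtd_sketch {
	uint32_t erasesize;            /* eraseblock size in bytes */
	uint32_t max_write_corruption; /* proposed: max bytes a cut write may corrupt */
};

/*
 * Given the offset within an eraseblock where unexpected data was found,
 * return the exclusive end of the region an interrupted write could have
 * corrupted. Assumption: corruption stays inside one aligned block of
 * max_write_corruption bytes.
 */
static uint32_t corruption_end(const struct mtd_sketch *mtd, uint32_t offs)
{
	uint32_t blk = mtd->max_write_corruption;
	uint32_t end;

	assert(blk > 0 && offs < mtd->erasesize);
	end = (offs / blk + 1) * blk;  /* round up to the next block boundary */
	return end < mtd->erasesize ? end : mtd->erasesize;
}
```

With a 64-byte block, garbage found at offset 70 would bound the suspect
region at offset 128; with an 8-byte block it would stop at offset 72.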

> 
> Yeah, let's wait for Eric's results and then will work on
> extending MTD device model with this parameter.
> 

As suggested, I patched my 2.6.27 kernel with the latest from
http://git.infradead.org/users/dedekind/ubifs-v2.6.27.git (includes all
updates up to and including the fix-recovery bug,
http://git.infradead.org/users/dedekind/ubifs-v2.6.27.git?a=commit;h=e144c1c037f1c6f7c687de5a2cd375cb40dfe71e).

I have the unit running with a maximum write buffer of 8 bytes (the NOR
flash chip is capable of 64 bytes).

I was seeing 4 different failure scenarios with the base 2.6.27 code,
but now I am only seeing one remaining failure after 30+ hours of power
cycling.  I added a stack dump this afternoon that will let me pinpoint
exactly what is happening, but I haven't seen the failure yet.

The failure happens when I get two corrupt empty LEBs.  I believe the
scenario is that an erase is interrupted and on the next boot, while the
file system is being recovered, another power failure occurs.
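
For reference, by "corrupt empty" I mean an LEB that should read back as all
0xFF but does not.  A minimal sketch of that check, assuming the LEB contents
have already been read into a buffer (count_dirty_bytes() is a hypothetical
helper, not a UBIFS function):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the bytes in a supposedly erased LEB that are not 0xFF.
 * A result of 0 means the LEB looks cleanly erased; anything else
 * suggests an interrupted erase or write.
 */
static size_t count_dirty_bytes(const uint8_t *buf, size_t len)
{
	size_t dirty = 0;

	for (size_t i = 0; i < len; i++)
		if (buf[i] != 0xFF)
			dirty++;
	return dirty;
}
```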

I can erase one of the LEBs manually in U-Boot and the file system
recovers properly.

I'm going to leave the units running over the weekend and see what is
waiting for me Monday morning.

Thanks for your help so far and have a great weekend!

-Eric

P.S.  I am scheduled to work on some higher-priority items next week, so
I won't be able to work on the max_write_corruption code.