UBIFS Corrupt during power failure
Eric Holmberg
Eric_Holmberg at Trimble.com
Mon May 18 13:30:40 EDT 2009
Hi Stefan,
I am still seeing corruption even with the write buffer size limited to
8 bytes, but it's greatly reduced. Unfortunately our schedule doesn't
allow me to work on this full-time for the immediate future, so I'm
limited to small chunks of time for now. Let me know if there is
anything I can do to assist/share since it looks like we both are in
need of fixing this.
At this point, I believe I have characterized the interrupted erase and
interrupted write patterns that are causing the problems, so the next
step I may take is to add the failure conditions into the NOR MTD device
simulator mtdram and see if I can get the same failures.
Let me know if you have any ideas for other approaches.
Here's the patch to change the maximum write buffer size to 8 bytes
(2^3).
Index: drivers/mtd/chips/cfi_probe.c
===================================================================
--- drivers/mtd/chips/cfi_probe.c (revision 4477)
+++ drivers/mtd/chips/cfi_probe.c (working copy)
@@ -18,7 +18,7 @@
#include <linux/mtd/cfi.h>
#include <linux/mtd/gen_probe.h>
-//#define DEBUG_CFI
+#define DEBUG_CFI
#ifdef DEBUG_CFI
static void print_cfi_ident(struct cfi_ident *);
@@ -251,6 +251,18 @@
cfi->cfiq->InterfaceDesc =
le16_to_cpu(cfi->cfiq->InterfaceDesc);
cfi->cfiq->MaxBufWriteSize =
le16_to_cpu(cfi->cfiq->MaxBufWriteSize);
+ //DEBUG - BEGIN - force max write size to 8 bytes (2^3)
+ if (cfi->cfiq->MaxBufWriteSize)
+ {
+ printk("Warning: Overriding MaxBufWriteSize from 2^%d to 2^%d\n",
+ cfi->cfiq->MaxBufWriteSize,
+ 3
+ );
+ cfi->cfiq->MaxBufWriteSize = 3;
+ }
+ //DEBUG - END
+
+
#ifdef DEBUG_CFI
/* Dump the information therein */
print_cfi_ident(cfi->cfiq);
Best Regards,
Eric Holmberg
Senior Firmware Engineer
Trimble Construction Services
Westminster, Colorado
> -----Original Message-----
> From: Stefan Roese [mailto:sr at denx.de]
> Sent: Friday, May 15, 2009 1:17 AM
> To: Eric Holmberg
> Cc: linux-mtd at lists.infradead.org; dedekind at infradead.org;
> Jamie Lokier; Urs Muff; Adrian Hunter
> Subject: Re: UBIFS Corrupt during power failure
>
> Hi Eric,
>
> On Saturday 18 April 2009 01:49:52 Eric Holmberg wrote:
> > > Yeah, let's wait for Eric's results and then will work on
> > > extending MTD device model with this parameter.
> >
> > As suggested, I patched my 2.6.27 kernel with the latest from
> > http://git.infradead.org/users/dedekind/ubifs-v2.6.27.git (includes all
> > updates up to and including the fix-recovery bug,
> > http://git.infradead.org/users/dedekind/ubifs-v2.6.27.git?a=commit;h=e144c1c037f1c6f7c687de5a2cd375cb40dfe71e).
> >
> > I have the unit running with a maximum write buffer of 8 bytes (the NOR
> > flash chip is capable of 64 bytes).
>
> How exactly did you do this? In cfi_cmdset_0002.c?
>
> > I was seeing 4 different failure scenarios with the base 2.6.27 code,
> > but now I am only seeing one remaining failure after 30+ hours of power
> > cycling. I added a stack dump this afternoon that will let me pinpoint
> > exactly what is happening, but haven't seen the failure, yet.
> >
> > The failure happens when I get two corrupt empty LEB's. I believe the
> > scenario is that an erase is interrupted and on the next boot, while the
> > file system is being recovered, another power failure occurs.
> >
> > I can erase one of the LEB's manually in U-Boot and the file system
> > recovers properly.
> >
> > I'm going to leave the units running over the weekend and see what is
> > waiting for me Monday morning.
>
> Do you have an update for this? What's the current status on your system now?
> Which patches did you apply to work reliably with the Spansion FLASH?
>
> I'm asking since we are seeing a similar issue on one of our boards equipped
> with the S29GL512P. This simple script triggers problems upon the next mount:
>
> ---
> mount -t ubifs ubi0:testvolume /mnt
> sync
> reboot -n -f
> ---
>
> The next mount will result most of the time in this:
>
> UBIFS: recovery needed
> UBIFS error (pid 406): ubifs_scan: corrupt empty space at LEB 3:130320
> UBIFS error (pid 406): ubifs_scanned_corruption: corrupted data at LEB 3:130320
> UBIFS error (pid 406): ubifs_scan: LEB 3 scanning failed
> UBIFS error (pid 406): ubifs_recover_leb: corrupt empty space at LEB 3:32
> UBIFS error (pid 406): ubifs_scanned_corruption: corrupted data at LEB 3:32
> UBIFS error (pid 406): ubifs_recover_leb: LEB 3 scanning failed
> mount: Structure needs cleaning
>
> This is without the patch from this thread included (in recovery.c). With this
> patch included the recovery is successful all the time, as far as we can see
> right now. But I'm wondering if we really need to disable the write buffer in
> the CFI driver or reduce the write buffer to 8.
>
> Thanks.
>
> Best regards,
> Stefan
>
> =====================================================================
> DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: +49-8142-66989-0 Fax: +49-8142-66989-80 Email: office at denx.de
> =====================================================================
>