Testing generic empty page bit flips recovery

Boris Brezillon boris.brezillon at free-electrons.com
Wed Dec 30 07:55:28 PST 2015


On Wed, 30 Dec 2015 09:33:52 -0600
"Franklin S Cooper Jr." <fcooper at ti.com> wrote:

> 
> 
> On 12/30/2015 08:40 AM, Boris Brezillon wrote:
> > Hi Franklin,
> >
> > On Wed, 30 Dec 2015 08:10:20 -0600
> > "Franklin S Cooper Jr." <fcooper at ti.com> wrote:
> >
> >> I am trying to follow up on this discussion from this patch
> >> set (https://patchwork.ozlabs.org/patch/539059/) which
> >> suggested that Michael instead test the generic bitflips
> >> recovery that is implemented by Boris "mtd: nand: properly
> >> handle bitflips in erased pages" patchset
> >> (http://lists.infradead.org/pipermail/linux-mtd/2015-September/061617.html).
> >> I would like to test Boris's patchset but first I need to
> >> recreate the error that his patch is fixing.
> >>
> >> The error that the patchset is attempting to fix isn't
> >> something I have ever encountered before. Currently I am
> >> trying to reproduce this issue on a TI K2E evm that uses the
> >> davinci nand driver. I flashed the nand's file-system
> >> partition with a ubi filesystem and the board is currently
> >> set to boot using the file-system on the nand. After about
> >> 60 secs I cut the power from the board and boot the board
> >> again. What I would expect is that the board will eventually
> >> fail to mount the ubi filesystem but currently the board has
> >> run for over 24 hours and been powered on and off over 1400
> >> times and it's still mounting the file-system perfectly fine.
> >>
> >> Any suggestions on a test case that I can use to force the
> >> empty page bit flips error?
> >>
> >>
> > The davinci driver seems to support raw accesses, so you can try to
> > apply this patch [1] against the mtd-utils tree (not sure it still
> > applies cleanly, but it should work with mtd-utils-1.5.1), and use the
> > nandflipbits tool:
> >
> > # flash_erase /dev/mtdX <offset> 1
> > # nandflipbits /dev/mtdX 1@<offset>
> > # nanddump -f /tmp/dump -s <offset> -l <page-size> /dev/mtdX
> >
> > Without the patch, nanddump should complain about uncorrectable errors,
> > and if you hexdump /tmp/dump you should see the bitflip.
> > If nanddump does not complain after applying my patch, then it means it
> > fixes the "bitflips in erased pages" bug.
> >
> > Best Regards,
> >
> > Boris
> >
> > [1]http://lists.infradead.org/pipermail/linux-mtd/2014-November/056634.html
> 
> Hi Boris,
> 
> Thanks for the quick reply. I built mtd-utils with your
> patch and ran the suggested commands on a 4.1 based kernel
> without your kernel patchset and I didn't see your expected
> output. The 4.1 based kernel hasn't had any changes to
> davinci_nand or nand subsystem that would address this
> bitflip error.
> 
> I'm currently going to attempt to run the same test on the
> latest mainline.
> 
> Here is the output I received when I ran your suggested
> commands on the 4.1 based kernel.
> root@k2e-evm:~# ./flash_erase /dev/mtd4 4096 1
> Erasing 128 Kibyte @ 0 -- 100 % complete
> root@k2e-evm:~# ./nandflipbits /dev/mtd4 1@4096
> root@k2e-evm:~# ./nanddump -f /tmp/dump -s 4096 -l 2048

You should probably use a block aligned offset (in your case a block is
128k), but that's not the problem here.

> /dev/mtd4
> ECC failed: 0
> ECC corrected: 0
> Number of bad blocks: 0
> Number of bbt blocks: 4
> Block size 131072, page size 2048, OOB size 64
> root@k2e-evm:~# hexdump /tmp/dump
> 0000000 fffd ffff ffff ffff ffff ffff ffff ffff

             ^
The bitflip is here.
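
To make that reading explicit, here is a minimal Python sketch that lists
every cleared bit in a dump of a page that is supposed to be erased (all
0xff). It assumes hexdump printed 16-bit words in little-endian order, so
the "fffd" above corresponds to byte 0 being 0xfd. The function name
find_cleared_bits is hypothetical, not part of mtd-utils.

```python
def find_cleared_bits(data: bytes):
    """Return (byte_offset, bit_index) for every 0 bit in an erased-page dump."""
    flips = []
    for offset, byte in enumerate(data):
        if byte == 0xFF:
            continue  # erased byte, nothing flipped
        for bit in range(8):
            if not (byte >> bit) & 1:
                flips.append((offset, bit))
    return flips


# Assumed stand-in for /tmp/dump: a 2048-byte page, all 0xff except byte 0,
# which is 0xfd (bit 1 cleared), matching the hexdump output above.
page = bytes([0xFD]) + b"\xff" * 2047
print(find_cleared_bits(page))  # prints [(0, 1)]
```

On the board, the same function could be fed the real dump with
find_cleared_bits(open("/tmp/dump", "rb").read()).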

> 0000010 ffff ffff ffff ffff ffff ffff ffff ffff
> *
> 0000800
> 
> Any thoughts on why I'm not seeing the expected error?
> 

Is ecc4bit mode really selected (ti,davinci-ecc-bits = 4 in your DT
node)?
You can add a trace there [1] to check that.

[1]http://lxr.free-electrons.com/source/drivers/mtd/nand/davinci_nand.c?v=4.1#L706

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


