JFFS2 errors on ppc-4xx with CFI NOR flash

Ryan Thompson i at ry.ca
Thu Mar 4 15:33:37 EST 2010


Hi Massimo,

The flash device is a Numonyx StrataFlash P33-65nm (1 Gbit version),
p/n PC28F00AP33EF.

I'll disable erase suspend and share my observations.

Thanks!
- R

On Thu, Mar 4, 2010 at 1:03 PM, massimo cirillo <maxcir at gmail.com> wrote:
> Please send the complete part number of the flash device and I'll try to
> help you. As a first attempt, try disabling the erase suspend feature in the
> flash driver.
>
> -----Original Message-----
> From: linux-mtd-bounces at lists.infradead.org
> [mailto:linux-mtd-bounces at lists.infradead.org] On Behalf Of Ryan Thompson
> Sent: Thursday, March 4, 2010 19:26
> To: linux-mtd at lists.infradead.org
> Subject: JFFS2 errors on ppc-4xx with CFI NOR flash
>
> Hi,
>
> We've been seeing errors on our ~32MiB jffs2 filesystem on a custom
> ppc-4xx board with a Numonyx 128MiB CFI NOR flash (128KiB erase
> blocks).
>
> The filesystem is mounted from /dev/mtd/modules, which is a symlink to
> /dev/mtdblock16, defined in the FDT as follows:
>
>            /* Modules (32128 KiB) */
>            partition at 2e80000 {
>                reg = <0x2E80000 0x1F60000>;
>                label = "modules";
>            };
>
> When a significant amount of data (e.g., a few files of a few MiB
> each) is written to the filesystem, we start seeing erase block
> errors, checksum failures, and garbage collection errors. However,
> these same filesystems have been in steady R&D use on this hardware
> and the same 2.6.28 kernel for ~6 months without issue. Until recently
> our use case has only involved writing a small number of tiny files.
> We only started seeing errors when we began to write larger files.
>
> (Console output is at the end of this message.)
>
> I have been able to reproduce this problem on multiple systems with
> the following script (/mnt is defined in fstab(5) as jffs2 with
> noatime,noauto,rw):
>
> --------------------------
>
> #!/bin/sh
>
> umount /mnt && mount /mnt
> cd /mnt
> df /mnt
> while dd if=/dev/urandom of=`mktemp` count=512 2>/dev/null; do
>    sync
>    df /mnt | tail -1
> done
> echo "Filesystem full?"
> sync; sync
> df /mnt
> rm *.tmp
> sync; sync; sync
> df /mnt | tail -1
>
> --------------------------
>
> The errors tend to occur just after df shows approximately 52-55%
> (perhaps garbage collection starts around this time?)
>
> This occurs on at least 2.6.31. After the errors occur, the filesystem
> is unusable until I reboot the system (the errors just keep repeating,
> and all reads and writes fail). However, when the system is rebooted,
> the filesystem seems to recover silently and be completely intact.
>
> We saw essentially the same errors on 2.6.28, but the kernel would
> panic. With 2.6.31, there is no panic, but a reboot is still necessary
> to restore operation.
>
> We use other partitions of the flash in block mode through mtdblockXX.
> As a test, I also formatted mtdblock16 (the jffs2 partition) as vfat.
> (Yes, I know how horrible this is!) With vfat, the above script filled
> the filesystem 10 times without issue (except for significantly
> reducing the lifespan of my flash part, I'm sure!) I additionally used
> a more complex version of the above script in some of my trials to
> store and verify the md5 sums of the random files written after the
> vfat filesystem was full; all files verified successfully.
>
> Here's the console output from one such incident:
>
> ----------- Console output --------------
> Newly-erased block contained word 0x19850003 at offset 0x01f20000
> Jan  1 00:06:15 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01f20000
> /dev/mtd/modules         32128     16916     15212  53% /mnt
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000044)
> Jan  1 00:06:16 rjt Node totlen on flash (0xffffffff) != totlen from
> node ref (0x00000044)
> kernel: Node totlen on flash (0xffffffff) != totlen from node ref
> (0x00000044)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Jan  1 Node totlen on flash (0xffffffff) != totlen from node ref
> (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> 00:06:16 TerraceNode CRC ffffffff != calculated CRC f09e7845 for node
> at 01e162f0
> Q kernel: Node totlen on flash (0xffffffff) != totlen from node ref
> (0x00000044)
> Jan  1 00:06:16 rjtNewly-erased block contained word 0x19850003 at
> offset 0x01d20000
>  kerneNewly-erased block contained word 0x19850003 at offset 0x01d00000
> l: Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Filesystem full?
> Jan  1 00:06:16 rjt last message 'kernel: Node totlen on flash
> (0xffffffff) != totlen from node ref
> Jan  1 00:06:16 rjt kernel: Node CRC ffffffff != calculated CRC
> f09e7845 for node at 01e162f0
> Jan  1 00:06:16 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01d20000
> Jan  1 00:06:16 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01d00000
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/mtd/modules         32128     17172     14956  53% /mnt
> rm: cannot remove '*.tmp': No such file or directory
> /dev/mtd/modules         32128     17172     14956  53% /mnt
> # Newly-erased block contained word 0x19850003 at offset 0x01ce0000
> Jan  1Newly-erased block contained word 0x19850003 at offset 0x01ca0000
>  00:06Newly-erased block contained word 0x19850003 at offset 0x01cc0000
> :20 Newly-erased block contained word 0x19850003 at offset 0x01c80000
> rjt kernel: Newly-erased block contained word 0x19850003 at offset
> 0x01ce0000
> Jan  1 00:06:20 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01ca0000
> Jan  1 00:06:20 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01cc0000
> Jan  1 00:06:20 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01c80000
> Newly-erased block contained word 0x19850003 at offset 0x01c60000
> Jan  1Newly-erased block contained word 0x19850003 at offset 0x01c40000
>  00:06Newly-erased block contained word 0x19850003 at offset 0x01c20000
> :25 rjNewly-erased block contained word 0x19850003 at offset 0x01c00000
> t Newly-erased block contained word 0x19850003 at offset 0x01be0000
> kernel:Newly-erased block contained word 0x19850003 at offset 0x01bc0000
>  Newly-erased block contained word 0x19850003 at offset 0x01c60000
> Jan  1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01c40000
> Jan  1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01c20000
> Jan  1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01c00000
> Jan  1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01be0000
> Jan  1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01bc0000
> Newly-erased block contained word 0x19850003 at offset 0x01ba0000
> Jan  1 Newly-erased block contained word 0x19850003 at offset 0x01b80000
> 00:06:Argh. No free space left for GC. nr_erasing_blocks is 0.
> nr_free_blocks is 0. (erasableempty: yes, erasingempty: yes,
> erasependingempty: yes)
> 26 rjtjffs2_reserve_space_gc of 196 bytes for garbage_collect_dnode failed:
> -28
>  Error garbage collecting node at 01b6db84!
> kerneNo space for garbage collection. Aborting GC thread
> l: Newly-erased block contained word 0x19850003 at offset 0x01ba0000
> Jan  1 00:06:26 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01b80000
> Jan  1 00:06:26 rjt kernel: Argh. No free space left for GC.
> nr_erasing_blocks is 0. nr_free_blocks is 0. (erasableempty: yes,
> erasingempty: yes, erasependingempty: yes)
> Jan  1 00:06:26 rjt kernel: jffs2_reserve_space_gc of 196 bytes for
> garbage_collect_dnode failed: -28
> Jan  1 00:06:26 rjt kernel: Error garbage collecting node at 01b6db84!
> Jan  1 00:06:26 rjt kernel: No space for garbage collection. Aborting GC
> thread
>
> I'd of course welcome any advice or further debugging suggestions.
>
> Thanks,
> - R
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
>
>

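For reference, the partition geometry quoted above can be sanity-checked
with plain shell arithmetic. This is only a sketch: it assumes the offsets
in the console output are relative to the start of the "modules" partition,
and it takes the 128 KiB erase block size from the report. The variable
names are mine, not from the original message.

```sh
#!/bin/sh
# Values copied from the FDT fragment and console output quoted above.
part_size=$((0x1F60000))      # length of the "modules" partition in bytes
erase_size=$((128 * 1024))    # 128 KiB erase blocks, as stated in the report

echo "size_kib=$((part_size / 1024))"
echo "erase_blocks=$((part_size / erase_size))"

# Assumption: reported offsets are partition-relative. Each one should then
# be a whole multiple of the erase block size (remainder 0).
for off in 0x01f20000 0x01d20000 0x01d00000 0x01ce0000; do
    echo "$off block=$(($off / erase_size)) remainder=$(($off % erase_size))"
done
```

This gives 32128 KiB and 251 erase blocks for the partition, and each of
the listed offsets falls on an erase block boundary.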

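The message mentions a more complex script that stored and verified md5
sums of the random files, but does not include it. A rough sketch of what
such a check might look like follows. It is not the script that was
actually used: the function name fill_and_verify, the verify-N.tmp file
names, and the fixed file count are all made up for illustration.

```sh
#!/bin/sh
# Hypothetical sketch, not the original script: write COUNT random files
# into TARGET, record their md5 sums, then re-read and verify them.
fill_and_verify() (
    target=$1
    count=$2
    sums=$(mktemp) || exit 1    # checksum list lives outside the tested fs
    cd "$target" || exit 1

    i=0
    while [ "$i" -lt "$count" ]; do
        f="verify-$i.tmp"
        # Same dd invocation as the reproduction script: 512 blocks of 512 bytes
        dd if=/dev/urandom of="$f" count=512 2>/dev/null || break
        sync
        md5sum "$f" >> "$sums"
        i=$((i + 1))
    done

    # md5sum -c re-reads every listed file and fails on any mismatch
    status=0
    if md5sum -c "$sums" >/dev/null 2>&1; then
        echo "all files verified"
    else
        echo "verification FAILED"
        status=1
    fi
    rm -f verify-*.tmp "$sums"
    exit "$status"
)
```

It could be called as "fill_and_verify /mnt 64", for example. The body
runs in a subshell so that the cd does not affect the caller.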
