JFFS2 errors on ppc-4xx with CFI NOR flash
massimo cirillo
maxcir at gmail.com
Thu Mar 4 15:16:43 EST 2010
Hi,
Sorry my earlier message was rejected because
of suspicious header.
Ryan,
Please specify the complete part number of the
flash device and I'll try to help you. As first
attempt try to disable erase suspend feature in
the flash driver.
2010/3/4 Ryan Thompson <i at ry.ca>
>
> Hi,
>
> We've been seeing errors on our ~32MiB jffs2 filesystem on a custom
> ppc-4xx board with a Numonyx 128MiB CFI NOR flash (128KiB erase
> blocks).
>
> The filesystem is mounted from /dev/mtd/modules, which is a symlink to
> /dev/mtdblock16, defined in the FDT as follows:
>
> /* Modules (32128 KiB) */
> partition at 2e80000 {
> reg = <0x2E80000 0x1F60000>;
> label = "modules";
> };
>
> When a significant amount of data (i.e., a few files over a few megs
> each) is written to the filesystem, we start seeing erase block
> errors, checksum failures, and garbage collection errors. However,
> these same filesystems have been in steady R&D use on this hardware
> and same 2.6.28 kernel for ~6 months without issue. Until recently our
> use case has only involved writing a small number of tiny files. We
> only started seeing errors when we began to write larger files.
>
> (Console output is at the end of this message.)
>
> I have been able to reproduce this problem on multiple systems with
> the following script (/mnt is defined in fstab(5) as jffs2 with
> noatime,noauto,rw):
>
> --------------------------
>
> #!/bin/sh
>
> umount /mnt && mount /mnt
> cd /mnt
> df /mnt
> while dd if=/dev/urandom of=`mktemp` count=512 2>/dev/null; do
> sync
> df /mnt | tail -1
> done
> echo "Filesystem full?"
> sync; sync
> df /mnt
> rm *.tmp
> sync; sync; sync
> df /mnt | tail -1
>
> --------------------------
>
> The errors tend to occur just after df shows approximately 52-55%
> (perhaps garbage collection starts around this time?)
>
> This occurs on at least 2.6.31. After the errors occur, the filesystem
> is unusable until I reboot the system (the errors just keep repeating,
> all reads and writes fail). However when the system is rebooted the
> filesystem seems to (silently) recover and be completely intact.
>
> We saw essentially the same errors on 2.6.28, but the kernel would
> panic. With 2.6.31, there is no panic, but a reboot is still necessary
> to restore operation.
>
> We use other partitions of the flash in block mode through mtdblockXX.
> As a test, I also formatted mtdblock16 (the jffs2 partition) as vfat.
> (Yes, I know how horrible this is!) With vfat, the above script filled
> the filesystem 10 times without issue (except for significantly
> reducing the lifespan of my flash part, I'm sure!) I additionally used
> a more complex version of the above script in some of my trials to
> store and verify the md5 sums of the random files written after the
> vfat filesystem was full; all files verified successfully.
>
> Here's the console output from one such incident:
>
> ----------- Console output --------------
> Newly-erased block contained word 0x19850003 at offset 0x01f20000
> Jan 1 00:06:15 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01f20000
> /dev/mtd/modules 32128 16916 15212 53% /mnt
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000044)
> Jan 1 00:06:16 rjt Node totlen on flash (0xffffffff) != totlen from
> node ref (0x00000044)
> kernel: Node totlen on flash (0xffffffff) != totlen from node ref (0x00000044)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Jan 1 Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> 00:06:16 TerraceNode CRC ffffffff != calculated CRC f09e7845 for node
> at 01e162f0
> Q kernel: Node totlen on flash (0xffffffff) != totlen from node ref (0x00000044)
> Jan 1 00:06:16 rjtNewly-erased block contained word 0x19850003 at
> offset 0x01d20000
> kerneNewly-erased block contained word 0x19850003 at offset 0x01d00000
> l: Node totlen on flash (0xffffffff) != totlen from node ref (0x00000244)
> Filesystem full?
> Jan 1 00:06:16 rjt last message 'kernel: Node totlen on flash
> (0xffffffff) != totlen from node ref
> Jan 1 00:06:16 rjt kernel: Node CRC ffffffff != calculated CRC
> f09e7845 for node at 01e162f0
> Jan 1 00:06:16 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01d20000
> Jan 1 00:06:16 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01d00000
> Filesystem 1K-blocks Used Available Use% Mounted on
> /dev/mtd/modules 32128 17172 14956 53% /mnt
> rm: cannot remove '*.tmp': No such file or directory
> /dev/mtd/modules 32128 17172 14956 53% /mnt
> # Newly-erased block contained word 0x19850003 at offset 0x01ce0000
> Jan 1Newly-erased block contained word 0x19850003 at offset 0x01ca0000
> 00:06Newly-erased block contained word 0x19850003 at offset 0x01cc0000
> :20 Newly-erased block contained word 0x19850003 at offset 0x01c80000
> rjt kernel: Newly-erased block contained word 0x19850003 at offset 0x01ce0000
> Jan 1 00:06:20 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01ca0000
> Jan 1 00:06:20 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01cc0000
> Jan 1 00:06:20 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01c80000
> Newly-erased block contained word 0x19850003 at offset 0x01c60000
> Jan 1Newly-erased block contained word 0x19850003 at offset 0x01c40000
> 00:06Newly-erased block contained word 0x19850003 at offset 0x01c20000
> :25 rjNewly-erased block contained word 0x19850003 at offset 0x01c00000
> t Newly-erased block contained word 0x19850003 at offset 0x01be0000
> kernel:Newly-erased block contained word 0x19850003 at offset 0x01bc0000
> Newly-erased block contained word 0x19850003 at offset 0x01c60000
> Jan 1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01c40000
> Jan 1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01c20000
> Jan 1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01c00000
> Jan 1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01be0000
> Jan 1 00:06:25 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01bc0000
> Newly-erased block contained word 0x19850003 at offset 0x01ba0000
> Jan 1 Newly-erased block contained word 0x19850003 at offset 0x01b80000
> 00:06:Argh. No free space left for GC. nr_erasing_blocks is 0.
> nr_free_blocks is 0. (erasableempty: yes, erasingempty: yes,
> erasependingempty: yes)
> 26 rjtjffs2_reserve_space_gc of 196 bytes for garbage_collect_dnode failed: -28
> Error garbage collecting node at 01b6db84!
> kerneNo space for garbage collection. Aborting GC thread
> l: Newly-erased block contained word 0x19850003 at offset 0x01ba0000
> Jan 1 00:06:26 rjt kernel: Newly-erased block contained word
> 0x19850003 at offset 0x01b80000
> Jan 1 00:06:26 rjt kernel: Argh. No free space left for GC.
> nr_erasing_blocks is 0. nr_free_blocks is 0. (erasableempty: yes,
> erasingempty: yes, erasependingempty: yes)
> Jan 1 00:06:26 rjt kernel: jffs2_reserve_space_gc of 196 bytes for
> garbage_collect_dnode failed: -28
> Jan 1 00:06:26 rjt kernel: Error garbage collecting node at 01b6db84!
> Jan 1 00:06:26 rjt kernel: No space for garbage collection. Aborting GC thread
>
> I'd of course welcome any advice or further debugging suggestions.
>
> Thanks,
> - R
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
More information about the linux-mtd
mailing list