NAND and JFFS2 crash
simon at baydel.com
Thu Apr 24 06:22:06 EDT 2003
Thomas,
I checked into what you had said. The filesystem in question is the
root filesystem, and it gets mounted and unmounted at startup and
shutdown, so I cannot see how this could be my problem. As you
seem to be a busy man I thought I would not bother you again and
would try an update at a later date.
Last week I downloaded a new CVS tree. I create my SMC data by
booting the system off a hard disk running Linux. I first use dd to
copy the hard disk boot partition to the SMC. I noticed a number of
messages basically saying that writing NAND without ECC was a bad
idea. In my NAND-specific driver I set up the mtd_info structure for
soft ECC. However, there appears to be a new field, useecc, which
only appears to be used by JFFS2. I did not know what I was
expected to do here, so I modified my driver to set this and the
associated bit positions. Because I use partitions I had to modify
mtdpart to copy this information to the mtd_info structure, which is
set up on a per-partition basis. Now I could boot from the hard disk and
copy my boot disk to the SMC with no problem. I then erased and
created a new JFFS2 filesystem on another partition and copied
all the files for the root filesystem.
I then booted from the SMC and, although I got a few
Empty flash at 0x00469ffcb ends at 0x0046a000
messages, all seemed OK. The root filesystem was mounted and I
got the login prompt. However, when I started to log in I got a crash.
kernel BUG at gc.c:140!
invalid operand: 0000
CPU: 0
EIP: 0010:[<c018bb28>] Not tainted
EFLAGS: 00010296
eax: 0000003f ebx: 000000d4 ecx: c0262220 edx: 0000c200
esi: 000000d4 edi: 0000106e ebp: cffc04cc esp: cfbc5f1c
ds: 0018 es: 0018 ss: 0018
Process jffs2_gcd_mtd2 (pid: 22, stackpage=cfbc5000)
Stack: 00000000 c0111ce6 cfbc5f50 cfbc4000 cfe6a120 cfe6a120 cfbc4000 00000000
       cfbc4000 00000000 cfbc4000 cffc04cc cfbc4564 c018ea16 cffc04cc cfbc4574
       cffc04cc 00000001 00000000 00000080 00000000 00000000 00000000 00000000
Call Trace: [<c0111ce6>] [<c018ea16>] [<c0108be6>] [<c018e890>] [<c01073f6>]
  [<c018e890>]
Code: 0f 0b 8c 00 b9 8f 25 c0 8b 45 08 8b 55 08 40 52 89 45 08 55
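As a cross-check on the oops, the Code: bytes appear to carry the BUG()
location themselves. On i386 kernels of this era BUG() is, as far as I
know, emitted as ud2 (0f 0b) followed by a 16-bit line number and a
32-bit pointer to the file name string. A minimal Python sketch under
that assumption (decode_bug_bytes is a made-up helper name, not part of
any kernel tool):

```python
import struct

def decode_bug_bytes(code_line):
    """Decode 'ud2; .word line; .long file_ptr' from an oops Code: line.

    Assumes the i386 BUG() layout described above; returns None if the
    bytes do not start with the ud2 opcode.
    """
    raw = bytes(int(tok, 16) for tok in code_line.split())
    if len(raw) < 8 or raw[:2] != b"\x0f\x0b":
        return None
    # Little-endian: 16-bit line number, then 32-bit file name pointer.
    line, file_ptr = struct.unpack_from("<HI", raw, 2)
    return line, file_ptr

code = "0f 0b 8c 00 b9 8f 25 c0 8b 45 08 8b 55 08 40 52 89 45 08 55"
decoded = decode_bug_bytes(code)
if decoded is not None:
    print("BUG() at line %d, file name string at 0x%08x" % decoded)
```

For the trace above this gives line 140, which agrees with the
"kernel BUG at gc.c:140!" banner. Resolving the pointer to a file name
would need the matching vmlinux or System.map.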
I have noticed that someone else posted a similar crash on the list and
you suggested sending a dump of the SMC.
I would like to know if you could assist me in the same way. If so,
do you need a dump of the whole SMC or just the JFFS2 partition?
While playing about with this I also noticed a message similar to
jffs2_scan_dirent_node(): Node CRC failed on node at 0x0046a7f0 read 0xffffffff calculated 0xdec8161b
but the routine was jffs2_scan_inode_node, so I guess I am still
losing data somewhere?
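To triage a long boot log, one option is to separate the CRC failures
whose stored CRC reads back as 0xffffffff (which is what erased, never
programmed flash returns) from those with some other value. A rough
Python sketch; the regular expression is my own guess at the message
layout based on the line quoted above, and classify_crc_failures is a
hypothetical helper name:

```python
import re

# Matches lines such as:
#   jffs2_scan_dirent_node(): Node CRC failed on node at 0x0046a7f0
#   read 0xffffffff calculated 0xdec8161b
# Optional ':' and ',' are tolerated because the exact punctuation is assumed.
CRC_RE = re.compile(
    r"(\w+)\(\): Node CRC failed on node at (0x[0-9a-fA-F]+)[:,]?\s+"
    r"read (0x[0-9a-fA-F]+),?\s+calculated (0x[0-9a-fA-F]+)"
)

def classify_crc_failures(log_text):
    """Split CRC failures into (erased_looking, other) lists of tuples."""
    erased, other = [], []
    for m in CRC_RE.finditer(log_text):
        entry = (m.group(1), int(m.group(2), 16),
                 int(m.group(3), 16), int(m.group(4), 16))
        # A stored CRC of all ones suggests the field was never written.
        (erased if entry[2] == 0xFFFFFFFF else other).append(entry)
    return erased, other

sample_log = (
    "jffs2_scan_dirent_node(): Node CRC failed on node at 0x0046a7f0 "
    "read 0xffffffff calculated 0xdec8161b\n"
    # The second line is invented purely to exercise the 'other' branch.
    "jffs2_scan_inode_node(): Node CRC failed on node at 0x00470000 "
    "read 0x12345678 calculated 0x9abcdef0\n"
)
erased, other = classify_crc_failures(sample_log)
print("erased-looking: %d, other: %d" % (len(erased), len(other)))
```

If nearly all failures fall in the first group, that would point towards
nodes that were never completely written rather than towards bit errors
in data that did reach the chip, though this is only a heuristic.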
To be able to use this technology I need to make it reliable. Can
you suggest how I might find the cause of this problem?
Enable a specific debug level?
Check the hardware by writing patterns via the raw device?
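For the pattern test, what I have in mind is roughly the following
Python sketch. It only covers generating a pattern and comparing a
read-back buffer against it; the actual erase, write and read of the
raw device are left out, and the pattern formula and helper names are
arbitrary choices of mine, not anything from the MTD code:

```python
def make_pattern(length, seed=0):
    """Return `length` bytes of a position-dependent test pattern."""
    # Mixing in (i >> 8) makes successive 256-byte blocks differ, so a
    # page written to the wrong address does not compare equal.
    return bytes((seed + i * 7 + (i >> 8)) & 0xFF for i in range(length))

def find_mismatches(expected, actual, limit=16):
    """Return up to `limit` tuples (offset, expected, actual, xor_bits)."""
    if len(expected) != len(actual):
        raise ValueError("buffers differ in length")
    found = []
    for offset, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            found.append((offset, e, a, e ^ a))
            if len(found) >= limit:
                break
    return found

# Simulated read-back with a single flipped bit at offset 100.
pattern = make_pattern(512)
readback = bytearray(pattern)
readback[100] ^= 0x04
for offset, e, a, bits in find_mismatches(pattern, bytes(readback)):
    print("offset 0x%04x: wrote 0x%02x read 0x%02x (bits 0x%02x)"
          % (offset, e, a, bits))
```

The xor column is there so that single-bit flips can be told apart from
whole bytes reading back as 0xff.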
Many Thanks
Simon
On 6 Jan 2003, at 19:59, Thomas Gleixner wrote:
> On Monday 06 January 2003 18:04, simon at baydel.com wrote:
> > I download the CVS stuff mid December and again today. The
> > hardware ran ok before and could use jffs2 without errors but as I
> > added files it was slow and I could not make file systems on
> > partitions which contained bad blocks.
> >
> > The new CVS code seems to be much quicker and I can erase,
> > mount and copy files to my new filesystem without error. I have set
> > up the specific driver to do soft ecc. I noticed that when I reboot
> > the system and the filesystem gets mounted I get errors. The more
> > writes that occur the more errors I seem to get. I ran a test for a
> > week or so over the break which generated log files. A reboot after
> > this produced thousands of errors but the filesystem seemed ok.
> >
> > The errors are something like
> >
> > Empty flash at 0x00469ffcb ends at 0x0046a000
> This happens due to NAND specific timed buffer flushing. JFFS2 fills
> up the write buffer to a full page boundary with 0xff and writes out
> the buffer to the chip, if you have no consecutive write within 2
> seconds. This is done to ensure, that data is written to FLASH. This
> fill looks like empty FLASH on mount. So JFFS2 is wondering why there
> is data after the "empty" FLASH. No reason to worry.
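The padding behaviour described above can be imitated in a few lines of
Python. This is a toy model, not the JFFS2 code: the 512-byte page
size, the function names and the 4-byte minimum run length are all
assumptions made for illustration.

```python
PAGE_SIZE = 512  # assumed NAND page size for this illustration

def pad_to_page(buf, page_size=PAGE_SIZE):
    """Pad a partly filled write buffer with 0xff to the next page boundary."""
    remainder = len(buf) % page_size
    if remainder == 0:
        return bytes(buf)
    return bytes(buf) + b"\xff" * (page_size - remainder)

def find_empty_runs(image, min_len=4):
    """Return (start, end) for each 0xff run that is followed by more data."""
    runs = []
    i, n = 0, len(image)
    while i < n:
        if image[i] == 0xFF:
            j = i
            while j < n and image[j] == 0xFF:
                j += 1
            # Only report runs with real data after them; trailing 0xff
            # is simply unused flash.
            if j - i >= min_len and j < n:
                runs.append((i, j))
            i = j
        else:
            i += 1
    return runs

# A 100-byte write flushed by the timer, then a later write on the next page.
first_flush = pad_to_page(b"\x19\x85" + b"A" * 98)
later_write = b"\x19\x85" + b"B" * 50
image = first_flush + later_write
for start, end in find_empty_runs(image):
    print("Empty flash at 0x%08x ends at 0x%08x" % (start, end))
```

In this model the reported run always ends on a page boundary, as it
does in the 0x0046a000 message quoted above.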
>
> > or
> >
> > jffs2_scan_dirent_node(): Node CRC failed on node at 0x0046a7f0 read
> > 0xffffffff calculated 0xdec8161b
> This happens, if the write buffer is not written to FLASH before you
> power down your system without umount. Then the write buffer is lost
> and you get this error on mount. This indicates, that you may have
> lost data.
>
> > I was wondering if any of you could shed any light on this.
>
> --
> Thomas
>
> ______________________________________________________________________
> __ linutronix - competence in embedded & realtime linux
> http://www.linutronix.de mail: tglx at linutronix.de
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
__________________________
Simon Haynes - Baydel
Phone : 44 (0) 1372 378811
Email : simon at baydel.com
__________________________