UBIFS volume corruption (bad node at LEB 0:0)

Artem Bityutskiy dedekind at infradead.org
Thu Jan 8 01:46:45 EST 2009


On Wed, 2009-01-07 at 23:13 -0500, David Bergeron wrote:
> Hello all,
> 
> I'm getting some sort of volume corruption problem with UBIFS after doing
> rootfs updates using rsync.
> 
> I've cooked up a minimalist test trying to eliminate possible interference.
> The following steps will trigger the corruption almost every time. No errors
> or warnings are produced during this procedure, every step behaves as
> expected:
> 
> boot kernel ubi.mtd=0 root=ubi0:rootfs rootfstype=ubifs ro init=/bin/bash
> 
> # mount -t proc none /proc
> # ifconfig ...
> # mount -o remount,rw,sync /
> # rsync -aHxvi --delete ... /
> # mount -o remount,ro /
> # reboot -d -f
> 
> When rebooting, the kernel fails to mount the rootfs with the following
> error:
> 
> [   61.033142] UBIFS error (pid 1): ubifs_read_node: bad node type (11 but expected 6)
> [   61.040965] UBIFS error (pid 1): ubifs_read_node: bad node at LEB 0:0

Hmm, 11 is an orphan node and 6 is the superblock node. Indeed, LEB 0 has to
contain the superblock node and cannot contain orphans.
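
For readers decoding similar "bad node type" messages, here is a minimal
lookup sketch. The number-to-name mapping follows the node type enumeration in
the kernel's fs/ubifs/ubifs-media.h as best I recall it, so treat it as an
assumption and check it against your kernel tree; the function name
ubifs_node_name is hypothetical and not part of any tool.

```shell
# Map a UBIFS on-flash node type number to a human-readable name.
# Assumption: values mirror the enum in fs/ubifs/ubifs-media.h.
ubifs_node_name() {
    case "$1" in
        0)  echo "inode" ;;
        1)  echo "data" ;;
        2)  echo "directory entry" ;;
        3)  echo "extended attribute entry" ;;
        4)  echo "truncation" ;;
        5)  echo "padding" ;;
        6)  echo "superblock" ;;
        7)  echo "master" ;;
        8)  echo "LEB reference" ;;
        9)  echo "index" ;;
        10) echo "commit start" ;;
        11) echo "orphan" ;;
        *)  echo "unknown" ;;
    esac
}

# The two numbers from the error message above.
echo "found:    $(ubifs_node_name 11)"
echo "expected: $(ubifs_node_name 6)"
```

With that mapping, "bad node type (11 but expected 6)" reads as "found an
orphan node where the superblock node was expected".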

We do not think we tested UBIFS re-mounting well enough, so I would not
be surprised to see bugs there.

> I'm running a fresh 2.6.28.0. mtd-utils were built from git on 2008-11-19.
> About 3 weeks ago I also tried with an mtd-2.6.git master branch patched
> kernel to no avail.
> 
> Note that in my few attempts, when I do NOT remount read-only before
> rebooting, the filesystem has so far remained functional (albeit being left
> unclean) and it boots as expected. The error msg is always the same. A
> certain amount of filesystem changes is necessary to trigger the problem,
> simply touching a file is not enough. I have not determined exactly what
> operations rsync needs to perform to reach breaking point, but sometimes
> everything goes well.

Hmm, OK. I'll try to look at this and figure out what is going wrong.
It would help a lot if I could reproduce this on my setup. So you could
help by sending a shell script which reproduces this issue, if you can.
It is better to work with nandsim, because this is the tool I use here
(http://www.linux-mtd.infradead.org/faq/nand.html#L_nand_nandsim)
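
As a starting point, a reproduction script on nandsim might look roughly like
the sketch below. It is untested against this bug and makes several
assumptions: nandsim ends up as mtd0, the ID bytes are the 256MiB example from
the FAQ linked above, SRC_TREE and MNT are hypothetical placeholders for the
rsync source and the mount point, and an unmount plus fresh mount stands in
for the "reboot -d -f" step. By default it only prints the commands; set
DRY_RUN=0 and run it as root to execute them.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set (root needed then).
DRY_RUN="${DRY_RUN:-1}"
SRC_TREE="${SRC_TREE:-/path/to/source/tree}"  # hypothetical rsync source
MNT="${MNT:-/mnt/ubifs}"                      # hypothetical mount point

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro() {
    # Simulated 256MiB NAND with 2048-byte pages (ID bytes from the MTD FAQ).
    run modprobe nandsim first_id_byte=0x20 second_id_byte=0xaa \
        third_id_byte=0x00 fourth_id_byte=0x15 &&
    # Attach mtd0 to UBI, mirroring ubi.mtd=0 on the kernel command line.
    run modprobe ubi mtd=0 &&
    run ubimkvol /dev/ubi0 -N rootfs -s 100MiB &&
    run mkdir -p "$MNT" &&
    # First read-write mount lets UBIFS create the file system on the volume.
    run mount -t ubifs ubi0:rootfs "$MNT" &&
    run umount "$MNT" &&
    # Same sequence as in the report: ro mount, remount rw, rsync, remount ro.
    run mount -t ubifs -o ro ubi0:rootfs "$MNT" &&
    run mount -o remount,rw,sync "$MNT" &&
    run rsync -aHx --delete "$SRC_TREE/" "$MNT/" &&
    run mount -o remount,ro "$MNT" &&
    # Stand-in for the reboot: unmount and mount again.
    run umount "$MNT" &&
    run mount -t ubifs -o ro ubi0:rootfs "$MNT"
}

repro
```

If the final mount fails with the "bad node at LEB 0:0" error, the script
reproduces the problem; if it does not, the amount of data in SRC_TREE may need
to be larger, since the report says small changes are not enough.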

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)



