"corrupt empty space" error on boot?!?
Steve deRosier
derosier at gmail.com
Mon Mar 2 08:39:59 PST 2015
Hi All,
So, after torturing one of our devices by rebooting it for a few
hundred iterations, we ran across a situation where the system fails
to boot due to a corrupt empty space error:
Starting kernel ...
Uncompressing Linux... done, booting the kernel.
UBIFS error (pid 1): ubifs_scan: corrupt empty space at LEB 4:3918
UBIFS error (pid 1): ubifs_scanned_corruption: corruption at LEB 4:398
UBIFS error (pid 1): ubifs_scanned_corruption: first 8192 bytes from 8
UBIFS error (pid 1): ubifs_scan: LEB 4 scanning failed
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-b)
This is on kernel v3.8 with the atmel_nand driver. In earlier
discussions, it was suggested that this driver would hit this sort of
problem because the driver/chip can't do ECC on erased pages, so a
bitflip there could be an issue. This is the first time I've seen
this problem in the wild, though.
1. Is this likely what I'm seeing?
2. Will moving to a recent kernel help (we're currently updating our
mainline to bleeding-edge 4.0)?
3. How can I programmatically recover from this situation?
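As a starting point for question 3, detecting the condition from a
captured boot log could look something like the sketch below. The
pattern is taken only from the "corrupt empty space" line in the log
above; the function name and the idea of scanning a saved console log
are assumptions for illustration, and the message format may differ
on other kernel versions. It only detects the failure, it does not
recover from it.

```python
import re

# Pattern modelled on the "corrupt empty space" line in the log above.
# Assumption: other kernel versions may word this message differently.
CORRUPT_EMPTY_RE = re.compile(
    r"UBIFS error \(pid (?P<pid>\d+)\): ubifs_scan: "
    r"corrupt empty space at LEB (?P<leb>\d+):(?P<offset>\d+)"
)


def find_corrupt_empty_space(log_text):
    """Return a list of (leb, offset) tuples, one per matching log line."""
    return [
        (int(m.group("leb")), int(m.group("offset")))
        for m in CORRUPT_EMPTY_RE.finditer(log_text)
    ]


sample = "UBIFS error (pid 1): ubifs_scan: corrupt empty space at LEB 4:3918"
print(find_corrupt_empty_space(sample))  # prints [(4, 3918)]
```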
Logically, it seems to me that a bit-flip in an empty page that isn't
ECC-protected should be a non-issue. UBI should be able to move the
block, erase the block, torture/return-to-service and move on with
its life. No data is destroyed or even affected.
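To make the "should be a non-issue" argument concrete, here is a
minimal sketch of treating a page as empty if it is all 0xFF apart
from a small number of flipped bits. The function names, the 2048
byte page size and the threshold of one bit are assumptions for
illustration only; this is not what UBIFS or the atmel_nand driver
does in v3.8.

```python
def count_flipped_bits(buf):
    """Count the bits that read as 0 in a buffer expected to be all 0xFF."""
    return sum(bin(b ^ 0xFF).count("1") for b in buf)


def looks_erased(buf, max_bitflips=1):
    """Treat the buffer as erased if at most max_bitflips bits are cleared.

    max_bitflips is a hypothetical tolerance, not a value from any driver.
    """
    return count_flipped_bits(buf) <= max_bitflips


# Hypothetical 2048-byte page with a single flipped bit (0xFB = 1111 1011).
page = bytearray(b"\xff" * 2048)
page[100] = 0xFB

print(count_flipped_bits(page))  # prints 1
print(looks_erased(page))        # prints True
```

With a check along these lines, a page with one stray zero bit would
still be classified as empty space, while a page containing real data
would exceed the threshold and be treated as written.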
A unit not mounting the rootfs because of a bit-flip in _empty_space_
is unacceptable to us, so I've got to figure out a way to deal with
this rare event.
Any help would be appreciated.
Thanks,
- Steve
More information about the linux-mtd
mailing list