UBIFS Corrupt during power failure

Eric Holmberg Eric_Holmberg at Trimble.com
Mon Mar 30 15:00:03 EDT 2009


Here is a summary of my findings so far in debugging corruption of the
root UBIFS volume, which is located on NOR flash.  Please comment if
you have any suggestions.

Physical Power Cycling
----------------------
Physically cycling the power causes LEB empty block corruption after an
average of 50 power cycles.  The problem typically occurs when power
fails while UBIFS is doing a recovery from the previous power failure.

See the attached boot log showing the two defective LEBs (file is
2009-03-26-corrupt-LEB-empty-block--physical-power-cycling.txt).

Using reboot -f
---------------
Using reboot -f, the system was rebooted between 0 and 30 seconds after
remounting the UBIFS partition for read/write access.

No corruption was seen, but after 4500 reboots, UBIFS ran out of empty
LEBs for saving the index.  The error comes from ubifs_rcvry_gc_commit()
in recovery.c:

	lnum = ubifs_find_free_leb_for_idx(c);
	if (lnum < 0) {
		dbg_err("could not find an empty LEB");
		return lnum;
	}

[42949374.870000] UBIFS: recovery needed
[42949387.320000] UBIFS: recovery deferred
[42949396.620000] UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[42949396.630000] UBIFS: mounted read-only
[42949396.630000] UBIFS: file system size: 27498240 bytes (26853 KiB, 26 MiB, 210 LEBs)
[42949396.640000] UBIFS: journal size: 4059264 bytes (3964 KiB, 3 MiB, 31 LEBs)
[42949396.650000] UBIFS: default compressor: LZO
[42949396.650000] UBIFS: media format 4, latest format 4
[42949396.670000] VFS: Mounted root (ubifs filesystem) readonly.
[42949396.680000] Freeing init memory: 112K
...
[42949397.290000] UBIFS: completing deferred recovery
[42949397.300000] UBIFS error (pid 294): ubifs_rcvry_gc_commit: could not find an empty LEB
mount: mounting ubi0:rootfs on / failed: No space left on device

This appears to be caused by the deferred recovery.  To reproduce it, I
did the following:
  1. Set the kernel command line to mount the UBIFS rootfs as read-only:
	rootfstype=ubifs root=ubi0:rootfs ro ubi.mtd=root mem=32M console=ttyAT3,115200

  2. In the init script, remount the rootfs with read/write access:
	mount -o remount,rw /

  3. Force a reboot at a random point (which may be in the middle of a
recovery) by using the following reboot code:
	let "delay=$RANDOM * 20/32767"
	/sbin/reboot -d  $delay -f -n
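
As a rough sketch of the delay arithmetic in step 3, the calculation can
be wrapped in a small function.  The names random_delay, rand and
max_delay are hypothetical and only for illustration; the random value
is passed in as an argument so the sketch also runs in shells that lack
$RANDOM.  With a factor of 20, the integer division gives a whole number
of seconds from 0 to 20.

```shell
#!/bin/sh
# Hypothetical helper: scale a random value in 0..32767 to a delay in
# whole seconds between 0 and max_delay (integer division truncates).
random_delay() {
	rand=$1
	max_delay=$2
	echo $(( rand * max_delay / 32767 ))
}

# In bash or busybox ash, $RANDOM supplies the random value; fall back
# to 0 where it is not available.
delay=$(random_delay "${RANDOM:-0}" 20)

# The reboot command is only printed here, not executed.
echo "/sbin/reboot -d $delay -f -n"
```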

Once the "could not find an empty LEB" error occurs, I can remove the
'ro' option from the kernel command line and the recovery completes
successfully and everything is fine.  The file system is using 57
percent of the available space:

/ # df -h
Filesystem                Size      Used Available Use% Mounted on
rootfs                   23.8M     13.6M     10.2M  57% /
ubi0:rootfs              23.8M     13.6M     10.2M  57% /
none                     14.2M         0     14.2M   0% /dev


Conclusion
----------
At this point, I am trying to isolate the cause of the corruption due to
power failures.  To determine if I have a hardware issue (we are using a
custom board), I am going to toggle the processor reset line instead of
cycling the power.  If that does not cause corruption, then it points to
a hardware issue with the NOR flash reset circuit.

I still have to solve the empty LEB issue when mounting the root file
system initially as read-only.  Any suggestions (such as maybe using an
initramfs for the early environment instead) are appreciated.


Here are the commands used to create the UBI volume for the 32MB NOR
flash with 256 sectors of 128KB each and a 16-bit data bus.  I have
reserved two 2MB partitions for two kernel images and the rest of the
blocks (224 blocks) are used by the UBI volume with a single UBIFS file
system.

	mkfs.ubifs --root=rfs_dir -o rootfs.ubifs -m 1 -e 130944 -c 256
	ubinize -o rootfs.ubi -s 1 -m 1 -p "128 KiB" ubi.ini

ubi.ini contents:

[ubifs]
mode=ubi
image=rootfs.ubifs
vol_id=0
# True size is 28MB, but this prevents the volume from being too big and
# it will resize to the maximum automatically on first mount
vol_size=24MiB
vol_type=dynamic
vol_name=rootfs
vol_flags=autoresize
vol_alignment=1
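
The geometry numbers above can be cross-checked with a little shell
arithmetic.  This is only a sketch: the variable names are illustrative,
and it assumes that the two UBI headers (erase counter and volume ID)
occupy 64 bytes each at the start of every eraseblock, which should hold
for NOR flash with a 1-byte minimum I/O unit.

```shell
#!/bin/sh
peb_size=$(( 128 * 1024 ))     # 128KB flash sector (physical eraseblock)
total_pebs=256                 # 32MB NOR flash

# Two 2MB kernel partitions, expressed in eraseblocks.
kernel_pebs=$(( 2 * 2 * 1024 * 1024 / peb_size ))
ubi_pebs=$(( total_pebs - kernel_pebs ))

# Assumption: 64-byte EC header plus 64-byte VID header per eraseblock.
leb_size=$(( peb_size - 2 * 64 ))

# Size of the 210 LEBs that UBIFS reports in the boot log.
fs_bytes=$(( 210 * leb_size ))

echo "Eraseblocks for UBI: $ubi_pebs"
echo "LEB size:            $leb_size bytes"
echo "210 LEBs:            $fs_bytes bytes"
```

Under that assumption the LEB size comes out as the 130944 bytes passed
to mkfs.ubifs with -e, the UBI partition has 224 eraseblocks, and 210
LEBs of that size give the 27498240 bytes that the boot log reports as
the file system size.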
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 2009-03-26-corrupt-LEB-empty-block--physical-power-cycling.txt
URL: <http://lists.infradead.org/pipermail/linux-mtd/attachments/20090330/b4e2953a/attachment-0001.txt>

