UBIFS does not mount after powerfail

Manfred Spraul manfred at colorfullife.com
Thu Nov 23 14:03:28 PST 2017


Hi Richard,

I now have three datasets:
- no xattr, no FASTMAP:
The log consists of ~189,000 WRITE or ERASE commands.
-- with chk_fs: 30,000 images tested, all ok.

-- with chk_fs, when splitting large writes at PAGE_SIZE: 814 images 
tested, all ok.

--> no issues at all when not using xattr.
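
For readers who want to reproduce this kind of test, here is a minimal sketch of how power-cut images can be derived from such a log, including the splitting of large writes at PAGE_SIZE. The tuple-based log format, the PAGE_SIZE value and all function names are assumptions made for illustration; this is not the actual test harness.

```python
# Hypothetical log model: ("WRITE", offset, data) or ("ERASE", offset, length).
# The real log format is not shown in this mail; this is an assumption.
PAGE_SIZE = 2048  # assumed NAND page size


def split_at_pages(offset, data, page_size=PAGE_SIZE):
    """Split one write into pieces that never cross a page boundary."""
    pieces = []
    pos = 0
    while pos < len(data):
        # Bytes left until the next page boundary.
        room = page_size - (offset + pos) % page_size
        chunk = data[pos:pos + room]
        pieces.append(("WRITE", offset + pos, chunk))
        pos += len(chunk)
    return pieces


def expand_log(log, page_size=PAGE_SIZE):
    """Return a log in which every WRITE is at most one page long."""
    out = []
    for cmd in log:
        if cmd[0] == "WRITE":
            out.extend(split_at_pages(cmd[1], cmd[2], page_size))
        else:
            out.append(cmd)
    return out


def image_after(log, n, size):
    """Replay the first n commands onto an erased (all 0xff) image.

    Simplification: a WRITE overwrites the bytes; real NAND can only
    clear bits.
    """
    img = bytearray(b"\xff" * size)
    for kind, off, arg in log[:n]:
        if kind == "WRITE":
            img[off:off + len(arg)] = arg
        else:  # ERASE: arg is the length of the erased range
            img[off:off + arg] = b"\xff" * arg
    return bytes(img)
```

Calling image_after(expand_log(log), n, size) for every n gives one image per simulated power cut, with cuts also landing between the pages of a large write.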

- ecryptfs with ecryptfs_xattr_metadata:
The log consists of ~188,000 WRITE or ERASE commands.

-- without chk_fs: 23,000 images tested, 5 images not mountable, all 5 
fail within garbage_collect_leb():

If I read this correctly, the root cause is always a node that crosses a page 
boundary:
the first half of the node is written, the second half is not written and 
is still 0xff.
These nodes cause CRC failures during scanning.
(Perhaps this is output of layout_in_empty_space(), and writing to an erased 
LEB, as opposed to changing an existing LEB, is not handled properly?)
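
The pattern described above can be checked mechanically. The following is a minimal sketch, assuming the LEB content is available as a bytes object and that node offset and length are already known; the function names and the page size are hypothetical. It only covers the simple case where everything after the first page boundary is still erased.

```python
def crosses_page(offset, length, page_size):
    """True if a node of the given length spans more than one page."""
    return offset // page_size != (offset + length - 1) // page_size


def looks_torn(leb, offset, length, page_size):
    """True if the node's first part is written but the rest is still 0xff.

    Simplified check: the part before the first page boundary must contain
    data, the part from that boundary to the node end must be all 0xff.
    """
    if not crosses_page(offset, length, page_size):
        return False
    boundary = (offset // page_size + 1) * page_size
    head = leb[offset:boundary]
    tail = leb[boundary:offset + length]
    return head != b"\xff" * len(head) and tail == b"\xff" * len(tail)
```

A node flagged by looks_torn() would fail its CRC check during scanning, because the CRC covers the unwritten second half as well.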

-- with chk_fs: 795 images tested, 62 not mountable.
Obviously including the 5 above: chk_fs runs after recovery_completed, 
garbage_collect_leb() is run during recovery.

-- kill-orphaned-xattr, with chk_fs: 215 images tested, 156 not mountable.
Note: This is not worse than without the patch. There are long streams 
of images that fail during chk_fs, 200 images is not enough for good 
statistics.
And: I have not tested the same images as without the patch.

- ecryptfs with ecryptfs_xattr_metadata and with FASTMAP
The log consists of ~197,000 WRITE or ERASE commands.

21,000 images tested, 178 do not mount. All fail in chk_fs.
The failure is always something like this:
> [34802.217857] UBIFS error (ubi0:0 pid 25706): ubifs_read_node: bad node at LEB 243:74672, LEB mapping status 0
> [34802.218965] Not a node, first 24 bytes:
> [34802.218969] 00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
I have not tested with chk_fastmap.
And: unlike above, where I tested the last images, here I tested 
the first 20k images, i.e. a more or less empty medium.
The lower failure rate could be caused by that.
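
When going through many failing images, it can help to pull the LEB number and offset out of the error line and tally them. A minimal sketch, assuming the message has the same shape as the one quoted above and sits on a single line; the function name is hypothetical.

```python
import re

# Matches the "bad node at LEB <lnum>:<offs>" part of the ubifs_read_node
# error message as it appears in the log quoted above.
BAD_NODE = re.compile(r"ubifs_read_node: bad node at LEB (\d+):(\d+)")


def bad_node_location(line):
    """Return (leb_number, offset) from an error line, or None if absent."""
    m = BAD_NODE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))
```

Feeding the kernel log of each failing image through bad_node_location() shows whether the failures cluster in particular LEBs or at particular offsets.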

Did you have the time to look at the images?
If you need more images, or if I should test a patch, just ask.

I have uploaded the most interesting images to sourceforge.
https://sourceforge.net/projects/calculix-rpm/files/ubifs/

--
     Manfred


