ubi : kernel panic on erroneous block
Matthieu CASTET
matthieu.castet at parrot.com
Mon Aug 23 09:30:00 EDT 2010
Hi Artem,
Artem Bityutskiy wrote:
> On Tue, 2010-08-10 at 11:56 +0200, Matthieu CASTET wrote:
>> Hi,
>>
>
> Matthieu, unfortunately I'm on holidays so cannot really look at this.
> And I already have a lot of UBI/UBIFS issues waiting for me to look at.
> I think I'll start looking at the things only in mid-September/October.
> Sorry for this. But may be Adrian could take a look at this, if he has
> some time? :-)
I don't know whether you are back from holidays, but since you are posting
on the ML, I will post my further investigation.
I have done more tests on this flash and I got other failures.
The problem seems to be in the handling of interrupted writes. On some NAND
chips we use, the page becomes unstable and reads can return unstable values.
The manufacturer told us we should not use pages where a write was
interrupted; they need an erase cycle before they can be used again.
On mounting, for pages where a write was interrupted by a power cut:
- I saw ECC errors. In this case ubifs should reject the page in recovery
handling and everything should be fine.
- I saw correctable errors. In this case ubi moves the block, unless the
next read in copy_page returns an ECC error. If the ECC error shows up in
the copy, we see it too late: ubifs recovery is already done.
- In this case ubifs recovery can reject it if the data is not OK (bad
CRC, ...). Note that in this case we did the scrubbing move for nothing.
- I saw pages that return correct data (ECC and CRC OK), but later they
return (un)correctable errors. Again this is too late [1]: recovery is
already done.
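To make the cases above concrete, here is a minimal sketch (not UBI code) of
what a single read result lets mount-time code conclude. It assumes the usual
MTD convention that an uncorrectable ECC error is reported as -EBADMSG (the
"error -74" in the trace [1]) and corrected bit-flips as -EUCLEAN. The enum
and function names are made up for illustration.

```c
#include <assert.h>
#include <errno.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* Linux value; assumed here when the libc lacks it */
#endif

/* Hypothetical verdicts, named for this sketch only. */
enum page_verdict {
	PAGE_REJECT,	/* uncorrectable ECC error */
	PAGE_SCRUB,	/* bit-flips were corrected */
	PAGE_TRUST	/* clean read */
};

/* Map the result of a single read to what mount-time code can conclude. */
static enum page_verdict classify_read(int err)
{
	if (err == -EBADMSG)
		return PAGE_REJECT;
	if (err == -EUCLEAN)
		return PAGE_SCRUB;
	assert(err == 0);	/* other I/O errors are out of scope here */
	return PAGE_TRUST;
}

/*
 * A verdict only stays valid if every later read gives the same result.
 * Returns 1 when a series of read results is consistent, 0 otherwise.
 */
static int verdict_is_stable(const int *errs, int n)
{
	int i;

	assert(n > 0);
	for (i = 1; i < n; i++)
		if (classify_read(errs[i]) != classify_read(errs[0]))
			return 0;
	return 1;
}
```

For the last case in the list, a read history such as { 0, 0, -EBADMSG }
gives PAGE_TRUST at mount time and PAGE_REJECT later, so verdict_is_stable()
returns 0 for it. The verdict taken at mount time was simply not reliable.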
It seems ubi/ubifs doesn't identify interrupted-write pages on
scanning/mount ATM. It relies only on ECC/CRC, but this is not enough
for unstable pages. They can be good (or have a 1-bit error) on one read
and bad on the next read.
So the problem is to identify interrupted-write pages on scanning/mount.
For static volumes it should be easy with the interrupted flags.
There is the tricky case of a data move (for wear leveling or scrubbing):
if the sqnum of the copy is the biggest, we should ignore it/copy it.
But for dynamic volumes/ubifs that's another story. Maybe using the ubi
sqnum + the ubifs journal it would be possible to do something.
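One possible reading of the data-move case, as a rough sketch only: when
scanning finds two PEBs for the same LEB and the newer one was written by a
move, distrust it, keep the source, and redo the copy. The struct below is
hypothetical; sqnum and copy_flag only mirror the VID header fields of the
same name. Unlike a check based on the data CRC, this never trusts a copy
that could have been interrupted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of what scanning learns about one PEB. */
struct peb_info {
	int      pnum;		/* physical eraseblock number */
	uint64_t sqnum;		/* sequence number from the VID header */
	bool     copy_flag;	/* set when the PEB was written by a data move */
};

/*
 * Pick the PEB to trust among two candidates for the same LEB.
 * Assumption of this sketch: a move copy that carries the highest sqnum
 * may come from an interrupted write, so fall back to the older PEB and
 * ask the caller to redo the copy.
 */
static const struct peb_info *pick_peb(const struct peb_info *a,
				       const struct peb_info *b,
				       bool *redo_copy)
{
	const struct peb_info *newer = a->sqnum > b->sqnum ? a : b;
	const struct peb_info *older = newer == a ? b : a;

	assert(a->sqnum != b->sqnum);	/* sqnum is expected to be unique */

	*redo_copy = newer->copy_flag;
	return newer->copy_flag ? older : newer;
}
```

With a source PEB at sqnum 5 and a move copy at sqnum 9, pick_peb() returns
the source and sets *redo_copy. If the newer PEB is an ordinary write, it is
returned as-is.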
Matthieu
PS: the same story happens for erase, but ubi should handle that correctly.
[1]
[ 12.720244] UBIFS: un-mount UBI device 3, volume 0
[ 12.760056] UBIFS: mounted UBI device 3, volume 0, name "system"
[ 12.765919] UBIFS: file system size: 30601216 bytes (29884 KiB, 29 MiB, 241 LEBs)
[ 12.773642] UBIFS: journal size: 1523712 bytes (1488 KiB, 1 MiB, 12 LEBs)
[ 12.780868] UBIFS: media format: w4/r0 (latest is w4/r0)
[ 12.786668] UBIFS: default compressor: none
[ 12.790852] UBIFS: reserved for root: 1445370 bytes (1411 KiB)
writing file '//mnt/dir06/file0046.bin' num=70, size=147120
writing file '//mnt/dir0c/file006c.bin' num=108, size=288146
[ 13.491407] UBI error: ubi_io_read: error -74 while reading 60 bytes from PEB 106:129480, read 60 bytes
[ 13.500785] [<c00279f0>] (dump_stack+0x0/0x14) from [<c0161040>] (ubi_io_read+0xf0/0x258)
[ 13.508952] [<c0160f50>] (ubi_io_read+0x0/0x258) from [<c01603a0>] (ubi_eba_read_leb+0x1b4/0x490)
[ 13.517791] [<c01601ec>] (ubi_eba_read_leb+0x0/0x490) from [<c015e3f0>] (ubi_leb_read+0xe8/0x138)
[ 13.526649] [<c015e308>] (ubi_leb_read+0x0/0x138) from [<c00d0c48>] (ubifs_read_node+0x40/0x190)
[ 13.535423] r7:00000002 r6:00000000 r5:c78489a0 r4:c78489a0
[ 13.541065] [<c00d0c08>] (ubifs_read_node+0x0/0x190) from [<c00d18b8>] (ubifs_read_node_wbuf+0x4c/0x204)
[ 13.550547] [<c00d186c>] (ubifs_read_node_wbuf+0x0/0x204) from [<c00e6b60>] (ubifs_tnc_read_node+0x5c/0xf8)
[ 13.560274] [<c00e6b04>] (ubifs_tnc_read_node+0x0/0xf8) from [<c00d32a8>] (matches_name+0x94/0xdc)
[ 13.569218] [<c00d3214>] (matches_name+0x0/0xdc) from [<c00d3334>] (resolve_collision+0x44/0x204)
[ 13.578074] [<c00d32f0>] (resolve_collision+0x0/0x204) from [<c00d45e4>] (ubifs_tnc_remove_nm+0xf0/0x108)
[ 13.587615] [<c00d44f4>] (ubifs_tnc_remove_nm+0x0/0x108) from [<c00c7f08>] (ubifs_jnl_rename+0x4f8/0x70c)
[ 13.597169] [<c00c7a10>] (ubifs_jnl_rename+0x0/0x70c) from [<c00caaf8>] (ubifs_rename+0x2b0/0x5e4)
[ 13.606117] [<c00ca848>] (ubifs_rename+0x0/0x5e4) from [<c008581c>] (vfs_rename+0x238/0x270)
[ 13.614538] [<c00855e4>] (vfs_rename+0x0/0x270) from [<c0086e54>] (sys_renameat+0x1b8/0x1cc)
[ 13.622965] [<c0086c9c>] (sys_renameat+0x0/0x1cc) from [<c0086e8c>] (sys_rename+0x24/0x28)
[ 13.631213] [<c0086e68>] (sys_rename+0x0/0x28) from [<c0023c00>] (ret_fast_syscall+0x0/0x2c)
[ 13.639670] UBIFS error (pid 273): ubifs_read_node: bad node type (0 but expected 2)
[ 13.647371] UBIFS error (pid 273): ubifs_read_node: bad node at LEB 47:125384
[ 13.654514] UBIFS warning (pid 273): ubifs_ro_mode: switched to read-only mode, error -22
/endurance: endurance.c: 197: create_file: Assertion `status == 0' failed.
[ 46.357586] UBIFS error (pid 101): make_reservation: cannot reserve 160 bytes in jhead 1, error -30
[ 46.366503] UBIFS error (pid 101): ubifs_write_inode: can't write inode 19507, error -30