ubifs error at boot "bad node type" again
Maxim Kochetkov
fido_max at inbox.ru
Fri Jul 16 00:10:48 PDT 2021
Hi, Richard, All!
I faced with corruption of UBIFS partition.
I have a small 16Mb UBIFS on mtd partition on SPI NOR flash.
This partition contain some my own binary applications:
drwxrw-rw- 2 197608 197121 1.2K Jan 10 23:29 .
drwxr-xr-x 19 root root 0 Jan 10 23:28 ..
-rwxrw-rw- 1 197608 197121 652.1K Jun 22 2021 dmr_ccl.elf
-rwxrw-rw- 1 197608 197121 942.6K Jun 22 2021 dmr_core.elf
-rwxrw-rw- 1 197608 197121 162.7K Jun 22 2021 dmr_fdrcontrol.elf
-rwxrw-rw- 1 197608 197121 42.2K Jun 22 2021 dmr_monitor.elf
-rwxrw-rw- 1 197608 197121 1.2M Jun 22 2021 dmr_pdsi.elf
-rwxrw-rw- 1 197608 197121 235.2K Jun 22 2021 dmr_registrator.elf
-rwxrw-rw- 1 197608 197121 832.1K Jun 22 2021 dmr_sip.elf
-rwxrw-rw- 1 197608 197121 243.6K Jun 22 2021 dmr_sma44d.elf
-rwxrw-rw- 1 197608 197121 98.6K Jun 22 2021 dmr_smtuts.elf
-rwxrw-rw- 1 197608 197121 1.2M Jun 22 2021 dmr_trcontrol.elf
-rwxrw-rw- 1 197608 197121 411.2K Jun 22 2021 dmr_trunk.elf
-rwxrw-rw- 1 197608 197121 449.5K Jun 22 2021 dsp.prg
-rwxrw-rw- 1 197608 197121 509 Jun 22 2021 startup.sh
-rwxrw-rw- 1 197608 197121 40.9K Jun 22 2021 svcmon
-rwxrw-rw- 1 197608 197121 1.1K Jun 22 2021 svcmon.conf
svcmon is runnig in background, kicking wathdog and
start/restart/monitor other *.elf programms.
In usual way everything works fine. The problem starts with software update.
My current update scenario is:
1. Kill all my *.elf process but keep svcmon
2. Remove all *.elf and svcmon from storage
3. Write new *.elf and svcmon to storage (with sync after write)
4. svcmon starts new *.elf binaries
From this point UBIFS can't be remounted to read-only because old
svcmon is still running.
So at reboot this partition can't be unmounted/remounted as read-only.
After reboot sometimes I get UBIFS corruption by two scenarios:
1. Lost some files
----------------------------------------------------------------
[ 85.978345] UBIFS error (ubi1:0 pid 258): 0xc01a5df4: bad node type
(0 but expected 1)
[ 86.002402] UBIFS error (ubi1:0 pid 258): 0xc01a5e1c: bad node at LEB
17:17352, LEB mapping status 1
[ 86.030010] Not a node, first 24 bytes:
[ 86.030076] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 ........................
[ 86.080359] CPU: 0 PID: 258 Comm: dmr_core.elf Tainted: G C
5.13.0.3 #3
[ 86.104302] Hardware name: Generic DA850/OMAP-L138/AM18x
[ 86.120378] Function entered at [<c000de24>] from [<c000c11c>]
[ 86.138057] Function entered at [<c000c11c>] from [<c01a5e3c>]
[ 86.155707] Function entered at [<c01a5e3c>] from [<c01be63c>]
[ 86.173346] Function entered at [<c01be63c>] from [<c01a8848>]
[ 86.190979] Function entered at [<c01a8848>] from [<c019c19c>]
[ 86.208613] Function entered at [<c019c19c>] from [<c019c7fc>]
[ 86.226240] Function entered at [<c019c7fc>] from [<c006e674>]
[ 86.243870] Function entered at [<c006e674>] from [<c006f72c>]
[ 86.261500] Function entered at [<c006f72c>] from [<c00879c8>]
[ 86.279125] Function entered at [<c00879c8>] from [<c008a30c>]
[ 86.296764] Function entered at [<c008a30c>] from [<c000e444>]
[ 86.314397] Function entered at [<c000e444>] from [<c000e6e4>]
[ 86.332033] Function entered at [<c000e6e4>] from [<c0009444>]
[ 86.349651] Exception stack(0xc3f17fb0 to 0xc3f17ff8)
[ 86.364975] 7fa0: 00135bf0
00010000 bed57e4c 00000900
[ 86.389653] 7fc0: 0010a000 00135bf0 00000015 00000001 bed57e44
bed57e4c b6fb7fbc 00000000
[ 86.414317] 7fe0: b6903f70 bed57cb8 00066d58 000b66b0 80000010 ffffffff
-----------------------------------------------------------------
In this case some of *.elf binaries is corrupted and cannot be even
read/copied, failed with IO error.
2. Totally corrupted partition
-----------------------------------------------------------------
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.836316] ubi0: default
fastmap pool size: 8
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.836407] ubi0: default
fastmap WL pool size: 4
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.836446] ubi0:
attaching mtd3
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.908650] ubi0: scanning
is finished
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.942161] ubi0: attached
mtd3 (name "CONFIG", size 4 MiB)
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.942254] ubi0: PEB
size: 65536 bytes (64 KiB), LEB size: 65408 bytes
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.942305] ubi0:
min./max. I/O unit sizes: 1/256, sub-page size 1
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.942338] ubi0: VID
header offset: 64 (aligned 64), data offset: 128
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.942371] ubi0: good
PEBs: 64, bad PEBs: 0, corrupted PEBs: 0
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.942403] ubi0: user
volume: 1, internal volumes: 1, max. volumes count: 128
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.942435] ubi0: max/mean
erase counter: 8/3, WL threshold: 4096, image sequence number: 2547763139
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.942477] ubi0:
available PEBs: 0, total reserved PEBs: 64, PEBs reserved for bad PEB
handling: 0
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.945775] ubi0:
background thread "ubi_bgt0d" started, PID 101
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.968871] ubi1: default
fastmap pool size: 10
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.968965] ubi1: default
fastmap WL pool size: 5
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 7.969004] ubi1:
attaching mtd4
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.250958] ubi1: scanning
is finished
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.285144] ubi1: attached
mtd4 (name "BOOT", size 16 MiB)
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.285240] ubi1: PEB
size: 65536 bytes (64 KiB), LEB size: 65408 bytes
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.285287] ubi1:
min./max. I/O unit sizes: 1/256, sub-page size 1
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.285324] ubi1: VID
header offset: 64 (aligned 64), data offset: 128
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.285358] ubi1: good
PEBs: 256, bad PEBs: 0, corrupted PEBs: 0
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.285391] ubi1: user
volume: 1, internal volumes: 1, max. volumes count: 128
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.285424] ubi1: max/mean
erase counter: 51/37, WL threshold: 4096, image sequence number: 374788538
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.285466] ubi1:
available PEBs: 0, total reserved PEBs: 256, PEBs reserved for bad PEB
handling: 0
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.292788] ubi1:
background thread "ubi_bgt1d" started, PID 103
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.312014] ubi2: default
fastmap pool size: 60
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.312106] ubi2: default
fastmap WL pool size: 30
Jan 10 07:44:52 RMU-H3 user.notice kernel: [ 8.312143] ubi2:
attaching mtd7
Jan 10 07:44:53 RMU-H3 user.notice kernel: [ 8.909857] random: crng
init done
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.620589] ubi2: scanning
is finished
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.658660] ubi2: attached
mtd7 (name "OTHER", size 75 MiB)
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.658755] ubi2: PEB
size: 65536 bytes (64 KiB), LEB size: 65408 bytes
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.658803] ubi2:
min./max. I/O unit sizes: 1/256, sub-page size 1
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.658839] ubi2: VID
header offset: 64 (aligned 64), data offset: 128
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.658872] ubi2: good
PEBs: 1200, bad PEBs: 0, corrupted PEBs: 0
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.658903] ubi2: user
volume: 1, internal volumes: 1, max. volumes count: 128
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.658936] ubi2: max/mean
erase counter: 13/8, WL threshold: 4096, image sequence number: 4095774474
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.658977] ubi2:
available PEBs: 0, total reserved PEBs: 1200, PEBs reserved for bad PEB
handling: 0
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.662356] ubi2:
background thread "ubi_bgt2d" started, PID 105
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.871311] UBIFS
(ubi0:0): Mounting in unauthenticated mode
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 9.872271] UBIFS
(ubi0:0): background thread "ubifs_bgt0_0" started, PID 114
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 10.042857] UBIFS
(ubi0:0): UBIFS: mounted UBI device 0, volume 0, name "config"
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 10.042960] UBIFS
(ubi0:0): LEB size: 65408 bytes (63 KiB), min./max. I/O unit sizes: 8
bytes/256 bytes
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 10.043021] UBIFS
(ubi0:0): FS size: 3139584 bytes (2 MiB, 48 LEBs), max 58 LEBs, journal
size 523265 bytes (0 MiB, 6 LEBs)
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 10.043078] UBIFS
(ubi0:0): reserved for root: 148290 bytes (144 KiB)
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 10.043114] UBIFS
(ubi0:0): media format: w5/r0 (latest is w5/r0), UUID
B656412D-624A-4796-AC59-D679298A592B, small LPT model
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 10.255899] UBIFS
(ubi1:0): Mounting in unauthenticated mode
Jan 10 07:44:54 RMU-H3 user.notice kernel: [ 10.258067] UBIFS
(ubi1:0): background thread "ubifs_bgt1_0" started, PID 123
Jan 10 07:44:55 RMU-H3 user.notice kernel: [ 10.343017] UBIFS
(ubi1:0): recovery needed
Jan 10 07:44:55 RMU-H3 user.err kernel: [ 10.929408] UBIFS error
(ubi1:0 pid 122): 0xc01ba8a8: corruption 0
Jan 10 07:44:55 RMU-H3 user.err kernel: [ 10.929519] UBIFS error
(ubi1:0 pid 122): 0xc01ab57c: corruption at LEB 223:23152
Jan 10 07:44:55 RMU-H3 user.err kernel: [ 10.929578] UBIFS error
(ubi1:0 pid 122): 0xc01ab5a4: first 8192 bytes from LEB 223:23152
Jan 10 07:44:55 RMU-H3 user.debug kernel: [ 10.929666] 00000000:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
................................
Jan 10 07:44:55 RMU-H3 user.debug kernel: [ 10.929732] 00000020:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
................................
Jan 10 07:44:55 RMU-H3 user.debug kernel: [ 10.929790] 00000040:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
................................
Jan 10 07:44:55 RMU-H3 user.debug kernel: [ 10.929846] 00000060:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
................................
---- all zeroes --------
Jan 10 07:44:55 RMU-H3 user.debug kernel: [ 10.958370] 00001fe0:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
................................
Jan 10 07:44:55 RMU-H3 user.err kernel: [ 10.958404] UBIFS error
(ubi1:0 pid 122): 0xc01ba5e8: LEB 223 scanning failed
Jan 10 07:44:55 RMU-H3 user.notice kernel: [ 10.958725] UBIFS
(ubi1:0): background thread "ubifs_bgt1_0" stops
--- failed to mount -> run ubiupdatevol -t ubi1_0 -----------
Jan 10 07:45:12 RMU-H3 user.notice kernel: [ 28.011277] UBIFS
(ubi1:0): default file-system created
Jan 10 07:45:12 RMU-H3 user.notice kernel: [ 28.013982] UBIFS
(ubi1:0): Mounting in unauthenticated mode
Jan 10 07:45:12 RMU-H3 user.notice kernel: [ 28.014936] UBIFS
(ubi1:0): background thread "ubifs_bgt1_0" started, PID 129
Jan 10 07:45:12 RMU-H3 user.notice kernel: [ 28.150290] UBIFS
(ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "boot"
Jan 10 07:45:12 RMU-H3 user.notice kernel: [ 28.150396] UBIFS
(ubi1:0): LEB size: 65408 bytes (63 KiB), min./max. I/O unit sizes: 8
bytes/256 bytes
Jan 10 07:45:12 RMU-H3 user.notice kernel: [ 28.150454] UBIFS
(ubi1:0): FS size: 15697920 bytes (14 MiB, 240 LEBs), max 250 LEBs,
journal size 784896 bytes (0 MiB, 12 LEBs)
Jan 10 07:45:12 RMU-H3 user.notice kernel: [ 28.150512] UBIFS
(ubi1:0): reserved for root: 741451 bytes (724 KiB)
Jan 10 07:45:12 RMU-H3 user.notice kernel: [ 28.150546] UBIFS
(ubi1:0): media format: w5/r0 (latest is w5/r0), UUID
D7A58512-0AD3-413F-AF45-304398E9EFA8, small LPT model
--------------------------------------------------------------
In this case partition can't be even mounted. The only option is format
this partition.
I understand that I should stop svcmon and all my other application
before reboot/unmount partition.
But power cut can happens before I stop them all.
I guess this problem may be related to:
http://lists.infradead.org/pipermail/linux-mtd/2014-July/054561.html
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/fs/ubifs?h=next-20210716&id=4ab25ac8b2b5514151d5f91cf9514df08dd26938
More information about the linux-mtd
mailing list