Problems mounting dirty UBIFS on NOR flash with cadence QSPI controller
Ian Abbott
abbotti at mev.co.uk
Wed Aug 28 07:05:22 PDT 2024
On 22/08/2024 14:15, Ian Abbott wrote:
> Hello,
>
> I'm currently investigating some UBIFS mount failures on an embedded
> system under development.
>
> The system is based on an Altera Cyclone V SoCFPGA (armv7l) platform,
> currently running kernel 6.6.36 (with a few patches for runtime DT
> overlay support and one or two unrelated driver bug fixes). This uses a
> Cadence QSPI controller (compatible="intel,socfpga-qspi", driver =
> spi-cadence-quadspi, quirks = CQSPI_DISABLE_DAC_MODE |
> CQSPI_NO_SUPPORT_WR_COMPLETION | CQSPI_SLOW_SRAM). The SPI NOR flash
> device is a Cypress (Spansion) S25FL256S1. The flash device has been
> partitioned with a single UBI partition filling the device, and contains
> a single UBIFS volume.
>
> % mtdinfo -a
> Count of MTD devices: 1
> Present MTD devices: mtd0
> Sysfs interface supported: yes
>
> mtd0
> Name: ubi
> Type: nor
> Eraseblock size: 65536 bytes, 64.0 KiB
> Amount of eraseblocks: 512 (33554432 bytes, 32.0 MiB)
> Minimum input/output unit size: 1 byte
> Sub-page size: 1 byte
> Character device major/minor: 90:0
> Bad blocks are allowed: false
> Device is writable: true
>
> The UBIFS volume started off mostly empty, but was deliberately dirtied
> by creating a file over and over again with the same name, and 4096
> bytes of random data, and then cutting power to the system.
>
> When the UBIFS volume is in this dirty state, attempts to mount the
> dirty UBIFS volume result in timeouts, and further attempt to access the
> MTD device result in further timeouts, even after removing the
> spi-cadence-quadspi and spi-nor kernel modules and reloading them. A
> complete reboot gets the controllers back in a sensible state, but the
> problems reoccur when attempting to mount the still dirty UBIFS volume
> again.
>
> Here are some kernel logs with debugging turned on for the ubi, ubifs,
> and spi-nor modules. I have removed the initial part of the logs until
> the interesting bit where it recovers a corrupted LEB and erases a PEB
> while reading some other LEBs. In the logs below, pid 430 is the mount
> process, and pid 167 is the background thread for the UBI device:
>
> -------------------------------[8<]----------------------------------
> [ 317.968185] UBIFS DBG mnt (pid 430): replay bud LEB 13, head 2, offs
> 80, is_last 1
> [ 317.968210] UBIFS DBG rcvry (pid 430): 13:80, jhead 2, grouped 1
> [ 317.968227] UBIFS DBG scan (pid 430): scan LEB 13:80
> [ 317.968244] UBI DBG gen (pid 430): read 65328 bytes from LEB 0:13:80
> [ 317.968263] UBI DBG eba (pid 430): read 65328 bytes from offset 80 of
> LEB 0:13, PEB 149
> [ 317.968284] UBI DBG io (pid 430): read 65328 bytes from PEB 149:208
> [ 317.968311] spi-nor spi2.0: from 0x009500d0, len 65328
> [ 317.979912] UBIFS DBG scan (pid 430): look at LEB 13:80 (65328 bytes
> left)
> [ 317.979946] UBIFS DBG scan (pid 430): hit empty space at LEB 13:80
> [ 317.979961] UBIFS DBG rcvry (pid 430): found corruption (-1) at 13:80
> [ 317.980398] UBIFS DBG rcvry (pid 430): cleaning corruption at 13:80
> [ 317.980434] UBIFS DBG scan (pid 430): stop scanning LEB 13 at offset 80
> [ 317.980448] UBIFS DBG rcvry (pid 430): fixing LEB 13 start 80 endpt 80
> [ 317.980464] UBI DBG gen (pid 430): read 80 bytes from LEB 0:13:0
> [ 317.980485] UBI DBG eba (pid 430): read 80 bytes from offset 0 of LEB
> 0:13, PEB 149
> [ 317.980503] UBI DBG io (pid 430): read 80 bytes from PEB 149:128
> [ 317.980528] spi-nor spi2.0: from 0x00950080, len 80
> [ 317.980702] UBI DBG gen (pid 430): atomically write 80 bytes to LEB 0:13
> [ 317.980743] UBI DBG eba (pid 430): change LEB 0:13
> [ 317.980768] UBI DBG wl (pid 430): PEB 150 EC 1
> [ 317.980782] UBI DBG wl (pid 430): added PEB 150 EC 1 to the
> protection queue
> [ 317.980797] UBI DBG eba (pid 430): write VID hdr and 80 bytes at
> offset 0 of LEB 0:13, PEB 150
> [ 317.980814] UBI DBG io (pid 430): write VID header to PEB 150
> [ 317.980830] UBI DBG io (pid 430): write 64 bytes to PEB 150:64
> [ 317.980851] spi-nor spi2.0: to 0x00960040, len 64
> [ 317.981916] UBI DBG io (pid 430): write 80 bytes to PEB 150:128
> [ 317.981956] spi-nor spi2.0: to 0x00960080, len 80
> [ 317.983099] UBI DBG wl (pid 430): PEB 149
> [ 317.983125] UBI DBG wl (pid 430): schedule erasure of PEB 149, EC 1,
> torture 0
> [ 317.983173] UBIFS DBG mnt (pid 430): bud LEB 13 replied: dirty 0,
> free 65328
> [ 317.983193] UBIFS DBG mnt (pid 430): replay bud LEB 14, head 1, offs
> 0, is_last 0
> [ 317.983210] UBIFS DBG scan (pid 430): scan LEB 14:0
> [ 317.983226] UBI DBG gen (pid 430): read 65408 bytes from LEB 0:14:0
> [ 317.983244] UBI DBG eba (pid 430): read 65408 bytes from offset 0 of
> LEB 0:14, PEB 142
> [ 317.983263] UBI DBG io (pid 430): read 65408 bytes from PEB 142:128
> [ 317.983282] spi-nor spi2.0: from 0x008e0080, len 65408
> [ 317.983437] UBI DBG wl (pid 167): erase PEB 149 EC 1 LEB 0:13
> [ 317.983461] UBI DBG wl (pid 167): erase PEB 149, old EC 1
> [ 317.983482] UBI DBG io (pid 167): read EC header from PEB 149
> [ 317.983496] UBI DBG io (pid 167): read 64 bytes from PEB 149:0
> [ 317.983515] spi-nor spi2.0: from 0x00950000, len 64
> [ 318.000130] UBIFS DBG scan (pid 430): look at LEB 14:19872 (45536
> bytes left)
> [ 318.000143] UBIFS DBG scan (pid 430): scanning inode node at LEB
> 14:19872
> [ 318.000159] UBIFS DBG scan (pid 430): look at LEB 14:20032 (45376
> bytes left)
> [ 318.000172] UBIFS DBG scan (pid 430): scanning truncate node at LEB
> 14:20032
>
> etc.
>
> [ 318.012452] UBIFS DBG scan (pid 430): look at LEB 14:65176 (232 bytes
> left)
> [ 318.012468] UBIFS DBG scan (pid 430): scanning truncate node at LEB
> 14:65176
> [ 318.012483] UBIFS DBG scan (pid 430): look at LEB 14:65232 (176 bytes
> left)
> [ 318.012496] UBIFS DBG scan (pid 430): hit empty space at LEB 14:65232
> [ 318.012509] UBIFS DBG scan (pid 430): stop scanning LEB 14 at offset
> 65232
> [ 318.012530] UBIFS DBG mnt (pid 430): add LEB 14:0, key (94, inode)
> [ 318.012550] UBIFS DBG mnt (pid 430): add LEB 14:160, key (94, truncate)
> [ 318.012566] UBIFS DBG mnt (pid 430): add LEB 14:216, key (94, inode)
> [ 318.012582] UBIFS DBG mnt (pid 430): add LEB 14:376, key (94, truncate)
>
> etc.
>
> [ 318.031260] UBIFS DBG mnt (pid 430): add LEB 14:64960, key (94,
> truncate)
> [ 318.031275] UBIFS DBG mnt (pid 430): add LEB 14:65016, key (94, inode)
> [ 318.031290] UBIFS DBG mnt (pid 430): add LEB 14:65176, key (94,
> truncate)
> [ 318.031307] UBIFS DBG mnt (pid 430): bud LEB 14 replied: dirty 16912,
> free 176
> [ 318.031623] UBI DBG gen (pid 430): read 4 bytes from LEB 0:16:0
> [ 318.031648] UBI DBG eba (pid 430): read 4 bytes from offset 0 of LEB
> 0:16, PEB 144
> [ 318.031668] UBI DBG io (pid 430): read 4 bytes from PEB 144:128
> [ 318.031692] spi-nor spi2.0: from 0x00900080, len 4
> [ 318.097506] UBI DBG wl (pid 167): erased PEB 149, new EC 2
> [ 318.097539] UBI DBG io (pid 167): write EC header to PEB 149
> [ 318.097560] UBI DBG io (pid 167): write 64 bytes to PEB 149:0
> [ 318.097584] spi-nor spi2.0: to 0x00950000, len 64
> [ 318.598001] cadence-qspi ff705000.spi: Flash command execution timed
> out.
> [ 318.604848] spi-nor spi2.0: operation failed with -110
> [ 318.609990] ubi0 warning: ubi_io_read [ubi]: error -110 while reading
> 4 bytes from PEB 144:128, read only 0 bytes, retry
> [ 318.610305] spi-nor spi2.0: from 0x00900080, len 4
> [ 319.113362] cadence-qspi ff705000.spi: QSPI is still busy after 500ms
> timeout.
> [ 319.120596] spi-nor spi2.0: operation failed with -110
> [ 319.125779] ubi0 warning: ubi_io_read [ubi]: error -110 while reading
> 4 bytes from PEB 144:128, read only 0 bytes, retry
> [ 319.126186] spi-nor spi2.0: from 0x00900080, len 4
> [ 319.633343] cadence-qspi ff705000.spi: QSPI is still busy after 500ms
> timeout.
> [ 319.640579] spi-nor spi2.0: operation failed with -110
> [ 319.645760] ubi0 warning: ubi_io_read [ubi]: error -110 while reading
> 4 bytes from PEB 144:128, read only 0 bytes, retry
> [ 319.646223] spi-nor spi2.0: from 0x00900080, len 4
> [ 320.153334] cadence-qspi ff705000.spi: QSPI is still busy after 500ms
> timeout.
> [ 320.160565] spi-nor spi2.0: operation failed with -110
> [ 320.165756] ubi0 error: ubi_io_read [ubi]: error -110 while reading 4
> bytes from PEB 144:128, read 0 bytes
> [ 320.175721] CPU: 1 PID: 430 Comm: mount Tainted: G O
> 6.6.36-its-ipt4v3 #1
> [ 320.175747] Hardware name: Altera SOCFPGA
> [ 320.175767] unwind_backtrace from show_stack+0x18/0x1c
> [ 320.175810] show_stack from dump_stack_lvl+0x40/0x4c
> [ 320.175836] dump_stack_lvl from ubi_io_read+0x160/0x37c [ubi]
> [ 320.176106] ubi_io_read [ubi] from ubi_eba_read_leb+0xa4/0x438 [ubi]
> [ 320.176541] ubi_eba_read_leb [ubi] from ubi_leb_read+0x88/0xf4 [ubi]
> [ 320.176986] ubi_leb_read [ubi] from ubifs_leb_read+0x34/0x80 [ubifs]
> [ 320.177701] ubifs_leb_read [ubifs] from
> ubifs_replay_journal+0xda4/0x1644 [ubifs]
> [ 320.178531] ubifs_replay_journal [ubifs] from
> ubifs_mount+0x4cc/0x1cc0 [ubifs]
> [ 320.179370] ubifs_mount [ubifs] from legacy_get_tree+0x2c/0x54
> [ 320.179800] legacy_get_tree from vfs_get_tree+0x2c/0x114
> [ 320.179835] vfs_get_tree from vfs_cmd_create+0x64/0xd0
> [ 320.179865] vfs_cmd_create from sys_fsconfig+0x474/0x52c
> [ 320.179895] sys_fsconfig from ret_fast_syscall+0x0/0x1c
> [ 320.179920] Exception stack(0xf4761fa8 to 0xf4761ff0)
> [ 320.179939] 1fa0: 00000000 00ec9eb0 00000003
> 00000006 00000000 00000000
> [ 320.179957] 1fc0: 00000000 00ec9eb0 00020000 000001af b6f57e3c
> b6f5877c b6f58000 00ec968c
> [ 320.179971] 1fe0: b6f58384 befc6a50 b6f3887b b6ec559a
> [ 320.179992] UBIFS error (ubi0:0 pid 430): ubifs_leb_read [ubifs]:
> reading 4 bytes from LEB 16:0 failed, error -110
> [ 320.190823] CPU: 1 PID: 430 Comm: mount Tainted: G O
> 6.6.36-its-ipt4v3 #1
> [ 320.190852] Hardware name: Altera SOCFPGA
> [ 320.190868] unwind_backtrace from show_stack+0x18/0x1c
> [ 320.190909] show_stack from dump_stack_lvl+0x40/0x4c
> [ 320.190934] dump_stack_lvl from ubifs_leb_read+0x7c/0x80 [ubifs]
> [ 320.191397] ubifs_leb_read [ubifs] from
> ubifs_replay_journal+0xda4/0x1644 [ubifs]
> [ 320.192210] ubifs_replay_journal [ubifs] from
> ubifs_mount+0x4cc/0x1cc0 [ubifs]
> [ 320.193033] ubifs_mount [ubifs] from legacy_get_tree+0x2c/0x54
> [ 320.193478] legacy_get_tree from vfs_get_tree+0x2c/0x114
> [ 320.193513] vfs_get_tree from vfs_cmd_create+0x64/0xd0
> [ 320.193543] vfs_cmd_create from sys_fsconfig+0x474/0x52c
> [ 320.193572] sys_fsconfig from ret_fast_syscall+0x0/0x1c
> [ 320.193597] Exception stack(0xf4761fa8 to 0xf4761ff0)
> [ 320.193615] 1fa0: 00000000 00ec9eb0 00000003
> 00000006 00000000 00000000
> [ 320.193634] 1fc0: 00000000 00ec9eb0 00020000 000001af b6f57e3c
> b6f5877c b6f58000 00ec968c
> [ 320.193647] 1fe0: b6f58384 befc6a50 b6f3887b b6ec559a
> [ 320.193749] UBIFS DBG mnt (pid 430): replay bud LEB 17, head 1, offs
> 0, is_last 0
> [ 320.193778] UBIFS DBG scan (pid 430): scan LEB 17:0
> [ 320.193797] UBI DBG gen (pid 430): read 65408 bytes from LEB 0:17:0
> [ 320.193817] UBI DBG eba (pid 430): read 65408 bytes from offset 0 of
> LEB 0:17, PEB 143
> [ 320.193836] UBI DBG io (pid 430): read 65408 bytes from PEB 143:128
> [ 320.193859] spi-nor spi2.0: from 0x008f0080, len 65408
> [ 320.703328] cadence-qspi ff705000.spi: QSPI is still busy after 500ms
> timeout.
> [ 320.710556] spi-nor spi2.0: operation failed with -110
> [ 320.715761] ubi0 warning: ubi_io_read [ubi]: error -110 while reading
> 65408 bytes from PEB 143:128, read only 0 bytes, retry
> [ 320.718370] spi-nor spi2.0: from 0x008f0080, len 65408
> [ 321.223330] cadence-qspi ff705000.spi: QSPI is still busy after 500ms
> timeout.
> [ 321.230573] spi-nor spi2.0: operation failed with -110
> [ 321.235748] ubi0 warning: ubi_io_read [ubi]: error -110 while reading
> 65408 bytes from PEB 143:128, read only 0 bytes, retry
> [ 321.236289] spi-nor spi2.0: from 0x008f0080, len 65408
> [ 321.743324] cadence-qspi ff705000.spi: QSPI is still busy after 500ms
> timeout.
> [ 321.750555] spi-nor spi2.0: operation failed with -110
> [ 321.755732] ubi0 warning: ubi_io_read [ubi]: error -110 while reading
> 65408 bytes from PEB 143:128, read only 0 bytes, retry
> [ 321.756179] spi-nor spi2.0: from 0x008f0080, len 65408
> [ 322.263322] cadence-qspi ff705000.spi: QSPI is still busy after 500ms
> timeout.
> [ 322.270550] spi-nor spi2.0: operation failed with -110
> [ 322.275720] ubi0 error: ubi_io_read [ubi]: error -110 while reading
> 65408 bytes from PEB 143:128, read 0 bytes
> [ 322.286026] CPU: 1 PID: 430 Comm: mount Tainted: G O
> 6.6.36-its-ipt4v3 #1
> [ 322.286050] Hardware name: Altera SOCFPGA
> [ 322.286067] unwind_backtrace from show_stack+0x18/0x1c
> [ 322.286111] show_stack from dump_stack_lvl+0x40/0x4c
> [ 322.286138] dump_stack_lvl from ubi_io_read+0x160/0x37c [ubi]
> [ 322.286409] ubi_io_read [ubi] from ubi_eba_read_leb+0xa4/0x438 [ubi]
> [ 322.286846] ubi_eba_read_leb [ubi] from ubi_leb_read+0x88/0xf4 [ubi]
> [ 322.287280] ubi_leb_read [ubi] from ubifs_leb_read+0x34/0x80 [ubifs]
> [ 322.287975] ubifs_leb_read [ubifs] from ubifs_start_scan+0x80/0x104
> [ubifs]
> [ 322.288787] ubifs_start_scan [ubifs] from ubifs_scan+0x2c/0x2e0 [ubifs]
> [ 322.289591] ubifs_scan [ubifs] from
> ubifs_replay_journal+0x3fc/0x1644 [ubifs]
> [ 322.290395] ubifs_replay_journal [ubifs] from
> ubifs_mount+0x4cc/0x1cc0 [ubifs]
> [ 322.291206] ubifs_mount [ubifs] from legacy_get_tree+0x2c/0x54
> [ 322.291635] legacy_get_tree from vfs_get_tree+0x2c/0x114
> [ 322.291670] vfs_get_tree from vfs_cmd_create+0x64/0xd0
> [ 322.291700] vfs_cmd_create from sys_fsconfig+0x474/0x52c
> [ 322.291729] sys_fsconfig from ret_fast_syscall+0x0/0x1c
> [ 322.291754] Exception stack(0xf4761fa8 to 0xf4761ff0)
> [ 322.291773] 1fa0: 00000000 00ec9eb0 00000003
> 00000006 00000000 00000000
> [ 322.291791] 1fc0: 00000000 00ec9eb0 00020000 000001af b6f57e3c
> b6f5877c b6f58000 00ec968c
> [ 322.291805] 1fe0: b6f58384 befc6a50 b6f3887b b6ec559a
> [ 322.291825] UBIFS error (ubi0:0 pid 430): ubifs_leb_read [ubifs]:
> reading 65408 bytes from LEB 17:0 failed, error -110
> [ 322.302953] CPU: 1 PID: 430 Comm: mount Tainted: G O
> 6.6.36-its-ipt4v3 #1
> [ 322.302977] Hardware name: Altera SOCFPGA
> [ 322.302990] unwind_backtrace from show_stack+0x18/0x1c
> [ 322.303028] show_stack from dump_stack_lvl+0x40/0x4c
> [ 322.303053] dump_stack_lvl from ubifs_leb_read+0x7c/0x80 [ubifs]
> [ 322.303507] ubifs_leb_read [ubifs] from ubifs_start_scan+0x80/0x104
> [ubifs]
> [ 322.304335] ubifs_start_scan [ubifs] from ubifs_scan+0x2c/0x2e0 [ubifs]
> [ 322.305156] ubifs_scan [ubifs] from
> ubifs_replay_journal+0x3fc/0x1644 [ubifs]
> [ 322.305962] ubifs_replay_journal [ubifs] from
> ubifs_mount+0x4cc/0x1cc0 [ubifs]
> [ 322.306765] ubifs_mount [ubifs] from legacy_get_tree+0x2c/0x54
> [ 322.307194] legacy_get_tree from vfs_get_tree+0x2c/0x114
> [ 322.307229] vfs_get_tree from vfs_cmd_create+0x64/0xd0
> [ 322.307258] vfs_cmd_create from sys_fsconfig+0x474/0x52c
> [ 322.307287] sys_fsconfig from ret_fast_syscall+0x0/0x1c
> [ 322.307313] Exception stack(0xf4761fa8 to 0xf4761ff0)
> [ 322.307332] 1fa0: 00000000 00ec9eb0 00000003
> 00000006 00000000 00000000
> [ 322.307350] 1fc0: 00000000 00ec9eb0 00020000 000001af b6f57e3c
> b6f5877c b6f58000 00ec968c
> [ 322.307364] 1fe0: b6f58384 befc6a50 b6f3887b b6ec559a
> [ 322.307443] UBIFS error (ubi0:0 pid 430): ubifs_start_scan [ubifs]:
> cannot read 65408 bytes from LEB 17:0, error -110
> [ 322.319829] UBIFS (ubi0:0): background thread "ubifs_bgt0_0" stops
> [ 322.320037] UBI DBG gen (pid 430): close device 0, volume 0, mode 2
> [ 322.320746] UBI DBG gen (pid 430): close device 0, volume 0, mode 1
> -------------------------------[8<]----------------------------------
>
> Some interesting points in the log:
>
> * 317.979961 where the mount/scanning process finds corruption in LEB 13
> (PEB 149) is found and LEB 13 is recovered and reassigned to PEB 150,
> and PE 149 is scheduled for erasure;
> * 317.983437 where the UBI background thread begins examining PEB 149
> for erasure;
> * 318.031623 where the mount/scanning process begins reading LEB 16 (PEB
> 144), reading 4 bytes from offset 0x00900080;
> * 318.097506 where the UBI background thread has erased PEB 149 and is
> writing a new header;
> * 318.598001 where the cadence-qspi controller times out the last Flash
> command and the SPI-NOR read operations start to fail.
>
> My next step is to dig deeper into the spi-cadence-quadspi and spi-nor
> drivers. I don't really know what's going on, but perhaps there is some
> sort of race condition between the erase and the read or the read and
> the write.
>
Just following up...
I found out that the Cadence QSPI flash command that timed out at
timestamp 318.598001 was 0x0cbb0400 which is SPINOR_OP_READ_FAST_4B
operation reading 4 bytes with an address length of 4 and 8 dummy cycles.
As a workaround, I have deleted the "m25p,fast-read" property and have
reduced the "spi-max-frequency" property value from 100000000 to
50000000 (which is the maximum supported frequency for non-fast reads).
It now mounts dirty UBIFS volumes successfully, at the expense of
reduced bandwidth.
I'm interested in any suggestions for a better workaround, if anyone has
any.
Ian
--
-=( Ian Abbott <abbotti at mev.co.uk> || MEV Ltd. is a company )=-
-=( registered in England & Wales. Regd. number: 02862268. )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-
More information about the linux-mtd
mailing list