mtd: raw: nand: gpmi-nand data corruption @ v5.10.184

Kegl Rohit keglrohit at gmail.com
Wed Jun 21 07:27:04 PDT 2023


Hello!

We are using an imx7d with the rt stable kernel tree.

After upgrading to v5.10.184-rt90 the rootfs ubifs mtd partition got corrupted.
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/tag/?h=v5.10.184-rt90

After reverting the latest patch
(e4e4b24b42e710db058cc2a79a7cf16bf02b4915), the rootfs partition did
not get corrupted.
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c?h=v5.10.184-rt90&id=e4e4b24b42e710db058cc2a79a7cf16bf02b4915

The commit message states the timeout calculation was changed.
Here are the calculated timeouts `busy_timeout_cycles` before (_old)
and after the patch (_new):

[    0.491534] busy_timeout_cycles_old 4353
[    0.491604] busy_timeout_cycles_new 1424705
[    0.492300] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xdc
[    0.492310] nand: Macronix MX30LF4G28AC
[    0.492316] nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 112
[    0.492488] busy_timeout_cycles_old 4353
[    0.492493] busy_timeout_cycles_new 1424705
[    0.492863] busy_timeout_cycles_old 2510
[    0.492872] busy_timeout_cycles_new 350000

The new timeouts are far higher: roughly 327x for the first two pairs
and 139x for the third. In principle a higher timeout should be
harmless, and only a too-low timeout should cause problems. Yet with
these high timeouts gpmi-nand is broken for us.
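
One thing that could be checked is whether the new values still fit into
the register field they are written to. The sketch below is only
arithmetic on the logged numbers. The 16-bit field width (FIELD_BITS) is
an assumption that is not confirmed anywhere in this report and would
need to be verified against the i.MX7D reference manual; the helper name
describe() is made up for illustration.

```python
# Values copied from the boot log above: (old, new) busy_timeout_cycles.
samples = [(4353, 1424705), (4353, 1424705), (2510, 350000)]

# ASSUMPTION: the hardware timeout field is 16 bits wide. This is not
# stated in this report; verify it against the reference manual.
FIELD_BITS = 16
FIELD_MAX = (1 << FIELD_BITS) - 1


def describe(old, new, field_max=FIELD_MAX):
    """Compare one old/new pair against the assumed field width."""
    return {
        "ratio": new / old,                # how much larger the new value is
        "old_fits": old <= field_max,      # old value fits the assumed field
        "new_fits": new <= field_max,      # new value fits the assumed field
        "new_truncated": new & field_max,  # low bits left if it were masked
    }


for old, new in samples:
    print(old, new, describe(old, new))
```

If the assumption holds, the old values fit but the new ones do not, so
only their low bits would reach the hardware. Whether that is related to
the corruption is unverified.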

For now we simply reverted the change.
The new calculation seems to be flaky: a previous "fix backport" was
already reverted because of data corruption.
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c?h=v5.10.184-rt90&id=cc5ee0e0eed0bec2b7cc1d0feb9405e884eace7d

Any guesses why the high timeout causes issues?


Thanks in advance!


