[bug report] mtd: bad block counter inflated when repeatedly marking the same block
Miquel Raynal
miquel.raynal at bootlin.com
Mon Sep 1 05:46:34 PDT 2025
Hello,
On 01/09/2025 at 17:26:45 +08, Wang Zhaolong <wangzhaolong at huaweicloud.com> wrote:
> Hi all,
>
> I’d like to report a mismatch between bad-block statistics and actual
> on-flash state when repeatedly calling MEMSETBADBLOCK on the same
> eraseblock.
>
> Summary
> - Repeatedly marking the same block bad (e.g., 5 times) makes
> /sys/class/mtd/mtdX/bad_blocks increase by 5.
> - After reboot, the statistical value ture to the correct value of 1.
> - So the runtime counter (ecc_stats.badblocks) is inflated.
>
> Repro (with nandsim.ko)
>
> ```bash
> # ID="0xec,0xa1,0x00,0x15" # 128M 128KB 2KB
> # modprobe nandsim id_bytes=$ID
> # ~/mtd-utils/mtd_markbad /dev/mtd1 10 1 # Repeat 5 times
> ......
> # ~/mtd-utils/mtd_markbad /dev/mtd1 10 1
>
> # -- It can be observed that 5 bad blocks will appear in the statistical information.
> # cat /sys/class/mtd/mtd1/bad_blocks
> 5
>
> # -- In fact, we can only scan 1 bad block.
> # ubiformat -v /dev/mtd1 | grep "bad eraseblock"
> ubiformat: 1 bad eraseblocks found, numbers: 10
> ```
>
> Root cause analysis (kernel-side)
>
> ```
> mtd_block_markbad
> mtd->_block_markbad()
> nand_block_markbad
> ret = nand_block_isbad
> return 0; // ret > 0
> mtd->ecc_stats.badblocks++; // No bad blocks was marked but was counted.
> Relevant code
> - drivers/mtd/nand/raw/nand_base.c:nand_block_markbad()
> - drivers/mtd/mtdcore.c:mtd_block_markbad()
> ```
>
> nand_block_markbad() returns 0 both for “newly marked” and “already bad”.
> mtdcore cannot tell whether this call actually added a new bad block,
> but still increments ecc_stats.badblocks.
>
> Possible fixes (high level)
> - Core-side conservative fix (minimal ABI change):
> * In mtd_block_markbad(), probe _block_isbad(master, ofs) before
> calling _block_markbad(), and (if available) probe again after
> success.
Sounds reasonable to me.
> * Only increment ecc_stats.badblocks if the state transitioned from
> “good” to “bad”.
>
> - Teach *_block_markbad() to return a distinct positive code for
> “already bad” vs “newly marked”, so the core can increment only on
> “newly marked”.
The subsystems have no other way to tell rather than probing the state
of the marker. I believe it is best to do it in mtd_block_markbad() in
the common mtdcore.c rather than in each implementation.
> What I want to know is:
> - Would the core-side pre/post _block_isbad check be acceptable as a short-term fix?
> - Any objections regarding the extra isbad IO in the markbad path?
Fine by me, it's not a hot path. If we end up here, we are already doing
damage control.;
> - Longer-term, is there interest in an explicit API/return-code semantics
> to differentiate “already bad” vs “newly marked”?
If you mean userspace API, I'd say not necessarily. Querying the stats
is probably the way to go. However in the kernel, while I would in
theory not be opposed to it, I don't see how one could implement that in
a more efficient way than probing the marker as discussed above in a
complex-free manner.
> I’m very interested in helping resolve this issue and would be grateful
> for any guidance or suggestions.
MTD folks in Cc, your feedback is also welcome ;-)
Thanks,
Miquèl
More information about the linux-mtd
mailing list