[bug report] mtd: bad block counter inflated when repeatedly marking the same block

Wang Zhaolong wangzhaolong at huaweicloud.com
Mon Sep 1 07:16:17 PDT 2025


>> Possible fixes (high level)
>> - Core-side conservative fix (minimal ABI change):
>>    * In mtd_block_markbad(), probe _block_isbad(master, ofs) before
>>      calling _block_markbad(), and (if available) probe again after
>>      success.
> 
> Sounds reasonable to me.
> 
>>    * Only increment ecc_stats.badblocks if the state transitioned from
>>      “good” to “bad”.
>>
>> - Teach *_block_markbad() to return a distinct positive code for
>>    “already bad” vs “newly marked”, so the core can increment only on
>>    “newly marked”.
> 
> The subsystems have no way to tell other than probing the state
> of the marker. I believe it is best to do it in mtd_block_markbad() in
> the common mtdcore.c rather than in each implementation.
> 

Thank you for your guidance!

>> What I want to know is:
>> - Would the core-side pre/post _block_isbad check be acceptable as a short-term fix?
>> - Any objections regarding the extra isbad IO in the markbad path?
> 
> Fine by me, it's not a hot path. If we end up here, we are already doing
> damage control.

I’ll prepare an RFC patch implementing the pre/post _block_isbad() checks
in mtd_block_markbad() and only increment ecc_stats.badblocks on a confirmed
good→bad transition. As discussed, this keeps the logic in mtdcore and
avoids per-driver changes.

For devices lacking _block_isbad(), I’ll take the conservative approach
(no increment unless we can confirm the transition) and call that out in the
commit message to gather feedback.
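
To make the intended accounting concrete, here is a minimal userspace
sketch of the logic, not the actual patch. Everything in it is a
hypothetical stand-in: struct sim_mtd, sim_isbad(), sim_markbad() and
sim_block_markbad() only model struct mtd_info, the driver hooks and
mtd_block_markbad(), and the badblocks field mirrors ecc_stats.badblocks.
The error handling (a failed probe is treated as "state unknown") is an
assumption for illustration, not something settled in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

#define SIM_NBLOCKS 8

/* Hypothetical stand-in for struct mtd_info; not the kernel type. */
struct sim_mtd {
	/* Returns 1 if bad, 0 if good, <0 on error. May be NULL. */
	int (*block_isbad)(struct sim_mtd *mtd, unsigned int block);
	/* Returns 0 on success, <0 on error. */
	int (*block_markbad)(struct sim_mtd *mtd, unsigned int block);
	bool bad[SIM_NBLOCKS];   /* simulated bad block markers */
	unsigned int badblocks;  /* mirrors ecc_stats.badblocks */
};

/* Simulated driver hook: read the marker. */
static int sim_isbad(struct sim_mtd *mtd, unsigned int block)
{
	return mtd->bad[block] ? 1 : 0;
}

/* Simulated driver hook: write the marker. Succeeds even if already set. */
static int sim_markbad(struct sim_mtd *mtd, unsigned int block)
{
	mtd->bad[block] = true;
	return 0;
}

/*
 * Model of the core-side logic: probe before and after marking, and
 * count the block only on a confirmed good -> bad transition.
 */
int sim_block_markbad(struct sim_mtd *mtd, unsigned int block)
{
	int was_bad = -1; /* -1: state unknown (no hook, or probe failed) */
	int ret;

	if (block >= SIM_NBLOCKS)
		return -1;

	if (mtd->block_isbad)
		was_bad = mtd->block_isbad(mtd, block);

	ret = mtd->block_markbad(mtd, block);
	if (ret)
		return ret;

	/*
	 * Conservative: without a usable isbad hook the transition cannot
	 * be confirmed, so the counter is left untouched.
	 */
	if (mtd->block_isbad && was_bad == 0 &&
	    mtd->block_isbad(mtd, block) > 0)
		mtd->badblocks++;

	return 0;
}
```

With both hooks set, marking the same block twice bumps the counter only
once. With block_isbad left NULL, the block is still marked but the
counter stays at zero, which is the conservative behaviour described
above.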

> If you mean userspace API, I'd say not necessarily. Querying the stats
> is probably the way to go. However, in the kernel, while I would in
> theory not be opposed to it, I don't see how one could implement that
> more efficiently than probing the marker as discussed above without
> adding complexity.
> 

That sounds very reasonable.

I will include reproduction methods and test cases based on both nandsim and
physical devices. Thank you again for your guidance!

Best regards,
Wang Zhaolong



