[PATCH] mtd: nand: default bitflip-reporting threshold to 75% of correction strength

Brian Norris computersforpeace at gmail.com
Tue Jan 13 10:48:05 PST 2015


Hi Richard,

On Tue, Jan 13, 2015 at 02:25:30PM +0100, Richard Weinberger wrote:
> On 12.01.2015 at 21:51, Brian Norris wrote:
> > The MTD API reports -EUCLEAN only if the maximum number of bitflips
> > found in any ECC block exceeds a certain threshold. This is done to
> > avoid excessive -EUCLEAN reports to MTD users, which may induce
> > additional scrubbing of data, even when the ECC algorithm in use is
> > perfectly capable of handling the bitflips.
> > 
> > This threshold can be controlled by user-space (via sysfs), to allow
> > users to determine what they are willing to tolerate in their
> > application. But it still helps to have sane defaults.
> > 
> > In recent discussion [1], it was pointed out that our default threshold
> > is equal to the correction strength. That means that we won't actually
> > report any -EUCLEAN (i.e., "bitflips were corrected") errors until there
> > are almost too many to handle. It was determined that 3/4 of the
> > correction strength is probably a better default.
> > 
> > [1] http://lists.infradead.org/pipermail/linux-mtd/2015-January/057259.html
> 
> I like this change but I have one question.
> 
> UBI will treat a block as bad if it shows bitflips (EUCLEAN) right
> after erasure.

Can you elaborate? When exactly is "after erasure"? The closest I see is that UBI
will mark a block bad if it sees an -EIO failure from sync_erase() in
erase_worker(). If you have extra debug checks on, then
ubi_self_check_all_ff() could potentially give you bitflip problems
after the erase, but that's an odd corner case, one which many
drivers have been handling in hacked-together, ad-hoc ways anyway (search
for "bitflips in erase pages").

So I can't pinpoint what you're talking about, exactly.

> For SLC NAND this works very well.
> Does this also hold for MLC NAND? If one or two bit flips are okay
> even for a freshly erased MLC NAND this change could cause UBI to
> mark good blocks as bad depending on the ECC strength.

I would typically assume that MLC NAND users must be using significantly
stronger ECC (e.g., 12-bit / 512-byte, at least), so "one or two
bitflips" would still fall well under 75% of 12 bits.
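
To put numbers on that, here is a minimal standalone sketch of the
arithmetic. The helper names are hypothetical, and both the round-up and
the ">=" comparison are assumptions made for illustration, not something
stated in this thread:

```c
#include <errno.h>

/* EUCLEAN is Linux-specific; the fallback value is only for other hosts. */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif

/*
 * Hypothetical helper: default threshold as 3/4 of the ECC correction
 * strength, rounded up (assumption) so that weak ECC (e.g., 1-bit)
 * never ends up with a threshold of 0.
 */
static unsigned int default_bitflip_threshold(unsigned int ecc_strength)
{
	return (ecc_strength * 3 + 3) / 4;
}

/*
 * Hypothetical helper: status a read would report, given the worst
 * per-ECC-block bitflip count. Assumes "at or above the threshold"
 * is what triggers -EUCLEAN.
 */
static int read_status(unsigned int max_bitflips, unsigned int threshold)
{
	return max_bitflips >= threshold ? -EUCLEAN : 0;
}
```

With 12-bit ECC, default_bitflip_threshold(12) gives 9, so
read_status(2, 9) stays at 0 and only 9 or more bitflips in a single ECC
block would come back as -EUCLEAN.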

Brian


