[PATCH] mtd: brcmnand: Workaround false ECC uncorrectable errors

Wed Dec 2 14:22:57 PST 2015

On Wed, Dec 02, 2015 at 10:08:20PM +0100, Jonas Gorski wrote:
> On Wed, Dec 2, 2015 at 9:54 PM, Brian Norris
> <computersforpeace at gmail.com> wrote:
> > If that's really an issue (i.e., we have an implementation + data), I'm
> > sure we could add optimization to nand_check_erased_ecc_chunk() to
> > support the bitflip_threshold == 0 case.
> 
> Maybe I'm missing something, but wasn't the point of introducing
> nand_check_erased_ecc_chunk that bitflips in erased pages should be
> treated as bitflips corrected by the ecc, and therefore fixed up
> before passing the data further on? So having a theshold of 0 would be
> wrong / no protection at all, and could be quite destructive on MLC
> nand, where bitflips in erased pages are rather common.

Yes, that would be pretty destructive on MLC NAND. But there might be
cases where this is reasonable. For instance, we might have:

 (a) SLC NAND that specifies up to 1 bitflip per 512 bytes
 (b) a Hamming ECC engine that can correct 1 bitflip
 (c) said Hamming engine does not handle all-0xff pages correctly, nor
 does it handle near-0xff pages (e.g., with a single 0 bit) correctly

Now, we might (for instance) allow up to ecc_strength / 2 bitflips in
erased pages [1], as a general rule, but for the above case, that means we
don't allow any on Hamming ECC. Hence, bitflip_threshold = 1 / 2 = 0.

This is kind of a degenerate case, so we might consider supporting it
differently than we would for BCH. e.g., don't use
nand_check_erased_ecc_chunk() at all.

Brian

[1] We might do this because we can expect that if an unprogrammed page
has N bitflips, there might be even more than N bitflips after
programming the page.