[PATCH v6 3/6] mtd: nand: omap: ecc.correct: omap_elm_correct_data: fix erased-page bit-flip correction for H/W ECC schemes

Brian Norris computersforpeace at gmail.com
Tue Jan 14 12:37:13 EST 2014


Hi Pekon,

On Mon, Jan 13, 2014 at 08:05:54PM -0800, Brian Norris wrote:
> On Sat, Jan 04, 2014 at 08:18:15AM +0530, Pekon Gupta wrote:
> >  - Irrespective of number of bit-flips an erased-page will only contain
> >    all(0xff), So its safe to return error-code -EUCLEAN with data_buf=0xff,
> >    instead of -EBADMSG.
> 
> Are you saying that all bitflips in erased pages should yield -EUCLEAN?
> I agree that they shouldn't return -EBADMSG (up to the strength
> threshold), but I also think that we should still be able to report the
> number of bitflips "corrected" in our erased page handling. That way,
> pages with small numbers of bitflips can still be corrected.
> 
> Put another way: what if every page starts to experience at least one
> bitflip? Do you want UBIFS to scrub the page every time? Rather, I think
> you want to calculate the proper count so that MTD can mask the bitflips
> if they are under the threshold. See my comment labeled [***] in the
> patch context below.

I totally forgot to put the [***] below when I got to it! See below (for
real this time!)

[...]
> > --- a/drivers/mtd/nand/omap2.c
> > +++ b/drivers/mtd/nand/omap2.c
> > @@ -1436,19 +1397,35 @@ static int omap_elm_correct_data(struct mtd_info *mtd, u_char *data,
[...]
> > +				switch (ecc_opt) {
> > +				case OMAP_ECC_BCH4_CODE_HW:
> > +					if (memcmp(calc_ecc, bch4_vector,
> > +							 actual_eccbytes))
> > +						stat += eccstrength;

The [***] goes here!

[***] I don't think you can assume 'eccstrength' number of bitflips
here. This sidesteps the bitflip threshold accounting, and it means that
even a single bitflip will cause the entire block to be scrubbed, even
if it was just erased. The key point here is that, eventually, we can't
assume that scrubbing will remove *all* bit errors in NAND flash. An
erased page is allowed to have a small number of bitflips or stuck bits.
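
To make the accounting concrete, here is a rough sketch of counting the
real number of bitflips in a sector that is expected to be erased. It is
not taken from the patch: the helper names, the -1 return value (standing
in for -EBADMSG) and the plain-C types are all assumptions for
illustration. A real driver would also want to include the OOB/ECC bytes
in the count.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): count the 0-bits in a buffer
 * that is expected to read back as all 0xff. Returns the number of
 * bitflips found, or -1 as soon as the count exceeds 'max_flips'.
 */
static int count_erased_bitflips(const uint8_t *buf, size_t len, int max_flips)
{
	int flips = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		uint8_t zeros = (uint8_t)~buf[i];	/* set bits mark flipped cells */

		while (zeros) {
			zeros &= (uint8_t)(zeros - 1);	/* clear lowest set bit */
			if (++flips > max_flips)
				return -1;
		}
	}
	return flips;
}

/*
 * Hypothetical per-sector fixup: if the sector is within 'strength'
 * bitflips of being erased, restore it to 0xff and return the actual
 * bitflip count, so the caller can add it to 'stat'. Otherwise return -1
 * (a stand-in for -EBADMSG) and leave the data untouched.
 */
static int fix_erased_sector(uint8_t *data, size_t len, int strength)
{
	int flips = count_erased_bitflips(data, len, strength);

	if (flips < 0)
		return -1;
	if (flips > 0)
		memset(data, 0xff, len);
	return flips;
}
```

With something like this, 'stat' would grow by the returned count instead
of by 'eccstrength', and the MTD bitflip threshold decides whether the
read ends up as -EUCLEAN.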

> > +					break;
> > +				case OMAP_ECC_BCH8_CODE_HW:
> > +					if (memcmp(calc_ecc, bch8_vector,
> > +							 actual_eccbytes))
> > +						stat += eccstrength;

[***] Same here.

(BTW, this switch/case block can be reduced to a single case if you
select bch4_vector vs. bch8_vector in the switch/case from earlier in
this function.)
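
Roughly what I have in mind, as a standalone sketch rather than driver
code. The enum, struct and function names are made up for illustration,
and the vector contents are placeholder bytes, not the real erased-page
ECC values:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's ECC scheme identifiers. */
enum ecc_scheme { ECC_BCH4, ECC_BCH8 };

/* Placeholder bytes only; NOT the real erased-page ECC vectors. */
static const uint8_t bch4_ref[7]  = { 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07 };
static const uint8_t bch8_ref[13] = { 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
				      0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d };

struct erased_ref {
	const uint8_t *vector;	/* ECC expected for an erased sector */
	size_t len;		/* number of ECC bytes to compare */
};

/* Pick the reference vector once, up front. Returns 0, or -1 if unknown. */
static int select_erased_ref(enum ecc_scheme scheme, struct erased_ref *ref)
{
	switch (scheme) {
	case ECC_BCH4:
		ref->vector = bch4_ref;
		ref->len = sizeof(bch4_ref);
		return 0;
	case ECC_BCH8:
		ref->vector = bch8_ref;
		ref->len = sizeof(bch8_ref);
		return 0;
	}
	return -1;
}

/* The per-sector check then needs no switch at all. */
static int ecc_matches_erased(const struct erased_ref *ref,
			      const uint8_t *calc_ecc)
{
	return memcmp(calc_ecc, ref->vector, ref->len) == 0;
}
```

The unsupported-scheme error is then reported once, where the vector is
selected, instead of inside the per-sector loop.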

> > +					break;
> > +				default:
> > +					return -EINVAL;
> >  				}
> > +				memset(&data[i * eccsize], 0xff, eccsize);
> >  			}
> >  		}
> >  

Brian


