[RFC PATCH v2] mtd: nand: add erased-page bitflip correction

Gupta, Pekon pekon at ti.com
Tue Mar 18 02:23:10 EDT 2014


Hi Brain,


>From: Brian Norris [mailto:computersforpeace at gmail.com]
>>On Fri, Mar 14, 2014 at 06:58:20PM +0530, Pekon Gupta wrote:
>>> From: Brian Norris <computersforpeace at gmail.com>
>>
>>
>> *changes v1 -> v2*
>> Tried optimizing for performance by making following updates
>>  - *read_data and *read_oob passed as arguments to nand_verify_erased_page()
>>    to avoid re-reading the page from device chip->ecc.read_page_raw()
>
>This is not valid for all drivers. See the 'oob_required' parameter to
>read_page(). This means that if we can't verify for sure that it's
>uncorrectable without looking at the OOB, we have to re-read to include
>the OOB.
>
Ok.. But then your v1 patch does not, work as-it-is with OMAP NAND driver.
I think the problem is same as that of GPMI driver, there is difference in
Page layout when using ecc.read_page_raw() and ecc.read_page().


>>  - exit nand_verify_erased_page() as-soon-as bitflips > ecc.strength
>
>Thanks, that's probably good.
>
>>  - using hweightN() only for non-0xff data
>
>I'm not convinced that's a great optimization. I would write the simpler
>version without the 0xff check, then add it if it really makes sense.
>hweightN() is actually a pretty simple computation, and adding extra
>branches might be more harmful than helpful.
>
Yes, I'll try to get number to check this.


>>  - if erased-page is detected then clean only read_data
>>    ?? need to see if OOB cleaning affects file-system metadata like JFFS2 cleanmarker
>
>Nak. We must clean both OOB and data buffers if we are doing this
>"correction". We don't put FS-specific hacks here. I don't know the
>specifics of JFFS2 cleanmarkers, but JFFS2 should have valid ECC bytes
>present whenever it expects a (non-raw) read to retain a few 0 bits. Or
>else it should be using a raw read mode.
>
This is important, we cannot just ignore it.
JFFS2 writes a signature just after erase in oobfree[] area [a], this helps
to identify that block-erase was completed successfully. Now there are two
issues here:

(1) If you count complete spare area (which includes JFFS2 cleanmarker) then
you would always see more lot of bit-flips > ecc.strength.

(2) If after detection of erased-page if you memset(OOB, oob, mtd->oobsize)
Then you are clearing off this JFFS2 marker also, and the JFFS2 FS *may* crib
(however, I think JFFS2 uses chip->ecc.read_oob_raw() to read OOB, so it
 Should not be a problem).
- Therefore in my earlier patch [2] I was just counting bit-flips in read_ecc[]
  *not* read_oob[].
- Also, 'oob_required=1' may not be used by user-space utilities, unless 
   in specific  conditions like 'nanddump' || 'nandread --raw'.

 [...]

>> +			else {
>>  				ret = chip->ecc.read_page(mtd, chip, bufpoi,
>>  							  oob_required, page);
>> +				/* Check if its an erased-page with bit-flips */
>> +				if ((chip->options & NAND_ECC_NEED_CHECK_FF) &&
>> +					mtd->ecc_stats.failed - ecc_failures) {
>> +					ret = nand_verify_erased_page(mtd, chip,
>> +							bufpoi, chip->oob_poi);
>
>You have corrupted my patch so that now it does not handle the subpage
>case.
>
I think handling sub-page reads should be handled separately, because using
chip->ecc.read_page_raw() when sub-page is enabled, may not be a good.
 
[...]

>
>Anyway, I have already heard comments from several people (including
>you), but you have only addressed some of them in your version of my
>patch. Please, let me send my own patches, and you can provide your
>feedback there.
>
Yes for sure.. I was just trying to show the changes which worked for
OMAP NAND. I had anyways reverted the changes, NAKed by you,
before taking the performance numbers.
However, lets first get this working for all other drivers as well.
(And please do pay attention to JFFS2 clean-marker query).


[a] http://www.linux-mtd.infradead.org/faq/jffs2.html#L_clmarker

with regards, pekon



More information about the linux-mtd mailing list