[PATCH v2] mtd: nand: Add driver for M-sys / Sandisk diskonchip G4

Ivan Djelic ivan.djelic at parrot.com
Thu Nov 3 04:58:24 EDT 2011


On Wed, Nov 02, 2011 at 11:00:59PM +0000, Mike Dunn wrote:

Hi Mike,
A few comments below:

> +
> +/* value generated by the HW ecc generator upon reading blank page */
> +static uint8_t blank_read_hwecc[7] = {
> +       0xcf, 0x72, 0xfc, 0x1b, 0xa9, 0xc7, 0xb9 };

Using this blank ecc value to detect blank pages after reading them is not that
reliable, because if a bitflip appears in an erased page, the HW ecc generator
will generate a completely different sequence of bytes. Your driver will think
the page is not blank, it will try to correct errors and fail. And UBIFS will
not appreciate that.

I can see two cleaner alternatives to solve this issue:

1. When you program a page, before writing hwecc to oob, adjust it like this:

   hwecc[i] ^= blank_read_hwecc[i]^0xff;

The effect of this is that you now get 0xffs as ecc for blank pages, and bitflip
correction on erased pages for free. This is transparent to your controller,
because this "adjustment" cancels itself upon reading when calc_ecc^recv_ecc is
computed.

2. Use unprotected spare oob byte 15 as a programmming marker: remove it from
the oob_free list, and force it to 0 when you program a page. Now, you can
easily detect if a page is blank or has been programmed by checking this byte.
You can for instance count the number of bits set to 1 in the byte, and decide
it is blank if that number is greater than 4; this ensures you are robust to
bitflips in the marker byte itself.

My preference would go to option 2 in your case.

> +
> +       /* fix the errors */
> +       fixederrs = 0;
> +       for (i = 0; i < numerrs; i++) {
> +               if (errpos[i] > DOCG4_USERDATA_LEN * 8)
> +                       continue; /* error in ecc oob bytes; ignore */
> +               change_bit(errpos[i], (unsigned long *)buf);
> +               fixederrs++;
> +       }

Here, your check on errorpos[i] to skip ecc oob locations is correct; but a
bitflip (in ecc bytes or elsewhere) is an indication that the nand page needs
scrubbing (you don't want bitflips to accumulate), so you should increase
fixederrs in that case also.

If you see a _lot_ of scrubbing operations from UBI, you may also consider
concealing bitflip corrections; i.e. report them only if the number of corrected
bits is above a certain threshold (e.g. >= 3 in your case).

BR,
--
Ivan



More information about the linux-mtd mailing list