[PATCH v6] mtd: gpmi: Deal with bitflips in erased regions regions

Elie De Brauwer eliedebrauwer at gmail.com
Fri Jan 3 04:43:41 EST 2014


On Thu, Dec 19, 2013 at 6:10 AM, Huang Shijie <b32955 at freescale.com> wrote:
> On Wed, Dec 18, 2013 at 08:07:12PM +0100, Elie De Brauwer wrote:
>> The BCH block typically used with a GPMI block on an i.MX28/i.MX6 is only
>> able to correct bitflips on data actually streamed through the block.
>> When erasing a block the data does not stream through the BCH block
>> and therefore no ECC data is written to the NAND chip. This causes
>> gpmi_ecc_read_page to return failure as soon as a single non-1-bit is
>> found in an erased page. Typically causing problems at higher levels
>> (ubifs corrupted empty space warnings). This problem was also observed
>> when using SLC NAND devices.
>>
>> This patch configures the BCH block to mark a block as 'erased' if
>> not too much bitflips are found. Next HW_BCH_STATUS0:ALLONES
>> is used to check if the data read were all ones, indicating a read of a
>> properly erased chunk was performed. If this was not the case a slow path
>> is entered where bitflips are counted and corrected in software,
>> allowing the upper layers to take proper actions.
>>
>> Signed-off-by: Elie De Brauwer <eliedebrauwer at gmail.com>
>> Acked-by: Peter Korsgaard <peter at korsgaard.com>
>
> thanks a lot!
>
> Acked-by: Huang Shijie <b32955 at freescale.com>
>

Hello Huang, all,

Huang, you suggested earlier after checking with your hw guys to make
use of the ALLONES bit
in the HW_BCH0_STATUS0 register, this looked like a nice solution
since this could introduce a fast
path. However the documentation did not mention how the ALLONES bit is
implemented, including whether
or not it takes the erase threshold into account. Obviously V6 of my
patch will only function if ALLONES function if _all bits are
physically one_ and fail to function if allones something like the AND
of all statusses of the chunk.

I've been given a board with another bitflip (jay). It propgates as
the following error:
"UBIFS error (pid 36): ubifs_recover_master_node: failed to recover master node"

And if I look at the data I see the LEB consists out of 55 valid
master nodes, but two pages after the last valid master node there is
a bitflip which causes in ubifs/recovery.c get_master_node() to
trigger:
                if (!is_empty(buf, len))
                        goto out_err;

When dumping the data read I get to see a bitflip:
[    4.094719] c8a32e30: ff ff ff fd ff ff ff ff ff ff ff ff ff ff ff
ff  ................

This leads me to believe that the ALLONES bit is actually not behaving
the way I think it was. Implying that it actually is taking the
ERASE_THRESHOLD into account. (I verified this by dumping the status
register which was 0xff10 and counting the bitflips which actually
showed allones to be set while a bitflip was found) Making the allones
bit more or less useless in our case. Hence I suggest to revert the
use of the allones bit to the, (somewhat slower when reading erased
blocks) solution which always checks the erased blocks in
gpmi_ecc_read_page():

if (*status == STATUS_ERASED) {
   /* Erased block with bitflips. */
   flips = erased_sector_bitflips(payload_virt, i, nfc_geo);
} else {
   /* BCH block corrected some errors for us. */
   flips = *status;
}

What do you think and/or could you verify with your hardware guys ?

Thanks
E.
-- 
Elie De Brauwer



More information about the linux-mtd mailing list