[PATCH] mtd: gpmi: Deal with bitflips in erased regions

Boris Brezillon boris.brezillon at free-electrons.com
Mon Apr 11 04:54:46 PDT 2016


Hi Stefan,

Sorry for the late reply.

On Tue, 15 Mar 2016 09:35:37 +0100
Stefan Christ <s.christ at phytec.de> wrote:

> From: Elie De Brauwer <eliedebrauwer at gmail.com>
> 
> The BCH block typically used with a GPMI block on an i.MX28/i.MX6 is only
> able to correct bitflips on data actually streamed through the block.
> When erasing a block the data does not stream through the BCH block
> and therefore no ECC data is written to the NAND chip. This causes
> gpmi_ecc_read_page to return failure as soon as a single non-1-bit is
> found in an erased page. Typically causing problems at higher levels
> (ubifs corrupted empty space warnings). This problem was also observed
> when using SLC NAND devices.
> 
> This patch configures the BCH block to mark a block as 'erased' if
> not too much bitflips are found (by setting the erase threshold). A
> consequence of this is that whenever an erased page is read, the
> number of bitflips will be counted and corrected in software,
> allowing the upper layers to take proper actions.
> 
> Signed-off-by: Elie De Brauwer <eliedebrauwer at gmail.com>
> Acked-by: Peter Korsgaard <peter at korsgaard.com>
> Acked-by: Huang Shijie <b32955 at freescale.com>
> ---
> Hi,
> 
> I'm reposting this patch, because it's still not in the mainline kernel tree.
> It was taken from
> 
>     http://lists.infradead.org/pipermail/linux-mtd/2014-January/051357.html
>     [PATCH v7] mtd: gpmi: Deal with bitflips in erased regions
> 
> The patch fixes the UBI corrupted empty space problem:
> 
>     UBIFS error (pid 74): ubifs_scan: corrupt empty space at LEB 3059:72260
>     UBIFS error (pid 74): ubifs_scanned_corruption: corruption at LEB 3059:72260
>     UBIFS error (pid 74): ubifs_scanned_corruption: first 8192 bytes from LEB 3059:72260
>     UBIFS error (pid 74): ubifs_scan: LEB 3059 scanning failed
> 
> which occurs in real systems sometimes.
> 
> I've tested the patch.  You can add my
> 
>     Tested-by: Stefan Christ <s.christ at phytec.de>
> 
> Mit freundlichen Grüßen / Kind regards,
>         Stefan Christ
> 
> ---
>  drivers/mtd/nand/gpmi-nand/bch-regs.h  |  1 +
>  drivers/mtd/nand/gpmi-nand/gpmi-lib.c  |  9 +++++++
>  drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 43 +++++++++++++++++++++++++++++++---
>  3 files changed, 50 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mtd/nand/gpmi-nand/bch-regs.h b/drivers/mtd/nand/gpmi-nand/bch-regs.h
> index 05bb91f..88cc89d 100644
> --- a/drivers/mtd/nand/gpmi-nand/bch-regs.h
> +++ b/drivers/mtd/nand/gpmi-nand/bch-regs.h
> @@ -31,6 +31,7 @@
>  
>  #define HW_BCH_STATUS0				0x00000010
>  #define HW_BCH_MODE				0x00000020
> +#define BM_BCH_MODE_ERASE_THRESHOLD_MASK	0xff
>  #define HW_BCH_ENCODEPTR			0x00000030
>  #define HW_BCH_DATAPTR				0x00000040
>  #define HW_BCH_METAPTR				0x00000050
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> index 0f68a99..a724dcb 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
> @@ -256,6 +256,7 @@ int bch_set_geometry(struct gpmi_nand_data *this)
>  	unsigned int ecc_strength;
>  	unsigned int page_size;
>  	unsigned int gf_len;
> +	unsigned int erase_threshold;
>  	int ret;
>  
>  	if (common_nfc_set_geometry(this))
> @@ -298,6 +299,14 @@ int bch_set_geometry(struct gpmi_nand_data *this)
>  			| BF_BCH_FLASH0LAYOUT1_DATAN_SIZE(block_size, this),
>  			r->bch_regs + HW_BCH_FLASH0LAYOUT1);
>  
> +	/* Set the tolerance for bitflips when reading erased blocks. */
> +	erase_threshold = gf_len / 2;
> +	if (erase_threshold > bch_geo->ecc_strength)
> +		erase_threshold = bch_geo->ecc_strength;

Can we make it consistent with other drivers and always use ecc_strength
instead of gf_len / 2?

> +
> +	writel(erase_threshold & BM_BCH_MODE_ERASE_THRESHOLD_MASK,
> +		r->bch_regs + HW_BCH_MODE);
> +
>  	/* Set *all* chip selects to use layout 0. */
>  	writel(0, r->bch_regs + HW_BCH_LAYOUTSELECT);
>  
> diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> index 235ddcb..ba852e8 100644
> --- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> +++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
> @@ -991,6 +991,30 @@ static void block_mark_swapping(struct gpmi_nand_data *this,
>  	p[1] = (p[1] & mask) | (from_oob >> (8 - bit));
>  }
>  
> +/*
> + * Count the number of 0 bits in a supposed to be
> + * erased region and correct them. Return the number
> + * of bitflips or zero when the region was correct.
> + */
> +static unsigned int erased_sector_bitflips(unsigned char *data,
> +					unsigned int chunk,
> +					struct bch_geometry *geo)
> +{
> +	unsigned int flip_bits = 0;
> +	int i;
> +	int base = geo->ecc_chunk_size * chunk;
> +
> +	/* Count bitflips */
> +	for (i = 0; i < geo->ecc_chunk_size; i++)
> +		flip_bits += hweight8(~data[base + i]);

Hm, the number of bitflips in this case is not always correct: you're
only counting bitflips in the data section and are completely ignoring
ECC bytes.

If you don't have direct access to ECC bytes, you should read them and
count the number of bitflips in there too.

> +
> +	/* Correct bitflips by 0xFF'ing this chunk. */
> +	if (flip_bits)
> +		memset(&data[base], 0xFF, geo->ecc_chunk_size);
> +
> +	return flip_bits;
> +}
> +

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com



More information about the linux-mtd mailing list