[PATCH] mtd: gpmi: Deal with bitflips in erased regions regions
Gupta, Pekon
pekon at ti.com
Tue Dec 17 05:17:38 EST 2013
>From: Elie De Brauwer
>
>The BCH block typically used with a GPMI block on an i.MX28/i.MX6 is only
>able to correct bitflips on data actually streamed through the block.
>When erasing a block the data does not stream through the BCH block
>and therefore no ECC data is written to the NAND chip. This causes
>gpmi_ecc_read_page to return failure as soon as a single non-1-bit is
>found in an erased page. Typically causing problems at higher levels
>(ubifs corrupted empty space warnings). This problem was also observed
>when using SLC NAND devices.
>
>This patch configures the BCH block to mark a block as 'erased' if
>no more than gf_len/2 bitflips are found. Next HW_BCH_STATUS0:ALLONES
>is used to check if the data read were all ones, indicating a read of a
>properly erased chunk was performed. If this was not the case a slow path
>is entered where bitflips are counted and corrected in software,
>allowing the upper layers to take proper actions.
>
>Signed-off-by: Elie De Brauwer <eliedebrauwer at gmail.com>
>Acked-by: Peter Korsgaard <peter at korsgaard.com>
[...]
>--- a/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
>+++ b/drivers/mtd/nand/gpmi-nand/gpmi-lib.c
>@@ -286,6 +286,13 @@ int bch_set_geometry(struct gpmi_nand_data *this)
> | BF_BCH_FLASH0LAYOUT1_DATAN_SIZE(block_size, this),
> r->bch_regs + HW_BCH_FLASH0LAYOUT1);
>
>+ /*
>+ * Set the tolerance for bitflips when reading erased blocks
>+ * equal to half the gf_len.
>+ */
>+ writel((gf_len / 2) & BM_BCH_MODE_ERASE_THRESHOLD_MASK,
>+ r->bch_regs + HW_BCH_MODE);
>+
> /* Set *all* chip selects to use layout 0. */
> writel(0, r->bch_regs + HW_BCH_LAYOUTSELECT);
>
>@@ -1094,6 +1101,15 @@ int gpmi_is_ready(struct gpmi_nand_data *this, unsigned chip)
> return reg & mask;
> }
>
>+/* Returns 1 if the last transaction consisted only out of ones. */
>+int gpmi_all_ones(struct gpmi_nand_data *this)
>+{
>+ struct resources *r = &this->resources;
>+ uint32_t reg = readl(r->bch_regs + HW_BCH_STATUS0);
>+
>+ return reg & BM_BCH_STATUS0_ALLONES_MASK;
>+}
>+
>
Just query..
Is GPMI controller is able to detect number of 0s or 1s while reading
the raw data from NAND ?
I'm asking because a similar implementation was used in omap2.c
driver reference: drivers/mtd/nand/omap2.c: erased_sector_bitflips()
But, we ended up degrading the driver performance, so even I was
looking for alternative implementations..
[...]
>
>+/*
>+ * Count the number of 0 bits in a supposed to be
>+ * erased region and correct them. Return the number
>+ * of bitflips or zero when the region was correct.
>+ */
>+static unsigned int erased_sector_bitflips(unsigned char *data,
>+ unsigned int chunk,
>+ struct bch_geometry *geo)
>+{
>+ unsigned int flip_bits = 0;
>+ int i;
>+ int base = geo->ecc_chunk_size * chunk;
>+
>+ /* Count bitflips */
>+ for (i = 0; i < geo->ecc_chunk_size; i++)
>+ flip_bits += hweight8(~data[base + i]);
>+
>+ /* Correct bitflips by 0xFF'ing this chunk. */
>+ if (flip_bits)
>+ memset(&data[base], 0xFF, geo->ecc_chunk_size);
But I don't quite understand here is..
Why do you want to mask off the bit-flips in NAND, and give corrected
data to upper layers ? Won't this lead to accumulation of bit-flips ?
Actually UBIFS checks for clean erased-pages before writing anything
on it [1]. And if UBIFS finds an unclean page, it re-erases it without
harming the system to avoid accumulation of bit-flips [2].
(Artem, please correct me if I'm wrong here)..
But with your above patch you will actually fool UBIFS to write on
unclean pages, which will increase the probability of bit-flips failures.
Example: your erased page had 4-bit flips, but you still reported
data as clean. Now when UBIFS will write on this erased page, it
has the margin of tolerating only 4 more bit-flips.
Now, if you want to get rid of the debug messages and stack_dump
coming into your boot log because of this, then better suppress
debug message in following file
----------------------
diff --git a/drivers/mtd/ubi/io.c b/drivers/mtd/ubi/io.c
index bf79def..e589ff3 100644
--- a/drivers/mtd/ubi/io.c
+++ b/drivers/mtd/ubi/io.c
@@ -1430,7 +1430,9 @@ fail:
print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 32, 1, buf, len, 1);
err = -EINVAL;
error:
+#ifdef DEBUG
dump_stack();
+#endif
vfree(buf);
return err;
}
----------------------
[1] drivers/mtd/ubi/io.c
@@ ubi_io_write(...)
err = ubi_self_check_all_ff( ... )
[2] drivers/mtd/ubi/wl.c
@@ wear_leveling_worker( ... )
/UBI_IO_FF_BITFLIPS
with regards, pekon
More information about the linux-mtd
mailing list