[PATCH V4] mtd: gpmi: fix the bitflips for erased page
Huang Shijie
b32955 at freescale.com
Mon Jan 13 03:47:18 EST 2014
We may meet the bitflips in reading an erased page(contains all 0xFF),
this may causes the UBIFS corrupt, please see the log from Elie:
-----------------------------------------------------------------
[ 3.831323] UBI warning: ubi_io_read: error -74 (ECC error) while reading 16384 bytes from PEB 443:245760, read only 16384 bytes, retry
[ 3.845026] UBI warning: ubi_io_read: error -74 (ECC error) while reading 16384 bytes from PEB 443:245760, read only 16384 bytes, retry
[ 3.858710] UBI warning: ubi_io_read: error -74 (ECC error) while reading 16384 bytes from PEB 443:245760, read only 16384 bytes, retry
[ 3.872408] UBI error: ubi_io_read: error -74 (ECC error) while reading 16384 bytes from PEB 443:245760, read 16384 bytes
...
[ 4.011529] UBIFS error (pid 36): ubifs_recover_leb: corrupt empty space LEB 27:237568, corruption starts at 9815
[ 4.021897] UBIFS error (pid 36): ubifs_scanned_corruption: corruption at LEB 27:247383
[ 4.030000] UBIFS error (pid 36): ubifs_scanned_corruption: first 6569 bytes from LEB 27:247383
-----------------------------------------------------------------
This patch does a check for the uncorrectable failure in the following steps:
[0] set the threshold.
The threshold is set based on the truth:
"A single 0 bit will lead to gf_len(13 or 14) bits 0 after the BCH
do the ECC."
For the sake of safe, we will set the threshold with half the gf_len, and
do not make it bigger the ECC strength.
[1] count the bitflips of the current ECC chunk, assume it is N.
[2] if the (N <= threshold) is true, we continue to read out the page with
ECC disabled. and we count the bitflips again, assume it is N2.
(We read out the whole page, not just a chunk, this makes the check
more strictly, and make the code more simple.)
[3] if the (N2 <= threshold) is true again, we can regard this is a erased
page. This is because a real erased page is full of 0xFF(maybe also has
several bitflips), while a page contains the 0xFF data will definitely
has many bitflips in the ECC parity areas.
[4] if the [3] fails, we can regard this is a page filled with the '0xFF'
data.
Tested-by: Elie De Brauwer <eliedebrauwer at gmail.com>
Reported-by: Elie De Brauwer <eliedebrauwer at gmail.com>
Signed-off-by: Huang Shijie <b32955 at freescale.com>
---
v3 --> v4:
[1] update the mtd->ecc_stats.corrected when we detect an erased page.
[2] add the shortcut when counting bitflips for non-ECC buffer.
[3] add more comments about why read a whole page, but not a chunk.
---
drivers/mtd/nand/gpmi-nand/gpmi-nand.c | 58 ++++++++++++++++++++++++++++++++
1 files changed, 58 insertions(+), 0 deletions(-)
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index e2f5820..6b6e707 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -958,6 +958,60 @@ static void block_mark_swapping(struct gpmi_nand_data *this,
p[1] = (p[1] & mask) | (from_oob >> (8 - bit));
}
+static bool gpmi_erased_check(struct gpmi_nand_data *this,
+ unsigned char *data, unsigned int chunk, int page,
+ unsigned int *max_bitflips)
+{
+ struct nand_chip *chip = &this->nand;
+ struct mtd_info *mtd = &this->mtd;
+ struct bch_geometry *geo = &this->bch_geometry;
+ int base = geo->ecc_chunk_size * chunk;
+ unsigned int flip_bits = 0, flip_bits_noecc = 0;
+ uint64_t *buf = (uint64_t *)this->data_buffer_dma;
+ unsigned int threshold;
+ int i;
+
+ threshold = geo->gf_len / 2;
+ if (threshold > geo->ecc_strength)
+ threshold = geo->ecc_strength;
+
+ /* Count bitflips */
+ for (i = 0; i < geo->ecc_chunk_size; i++) {
+ flip_bits += hweight8(~data[base + i]);
+ if (flip_bits > threshold)
+ return false;
+ }
+
+ /*
+ * Read out the whole page with ECC disabled, and check it again,
+ * This is more strict then just read out a chunk, and it makes
+ * the code more simple.
+ */
+ chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
+ chip->read_buf(mtd, (uint8_t *)buf, mtd->writesize);
+
+ /* Count the bitflips for the no ECC buffer */
+ for (i = 0; i < mtd->writesize / 8; i++) {
+ flip_bits_noecc += hweight64(~buf[i]);
+ if (flip_bits_noecc > threshold)
+ return false;
+ }
+
+ /* Tell the upper layer the bitflips we corrected. */
+ mtd->ecc_stats.corrected += flip_bits;
+ *max_bitflips = max_t(unsigned int, *max_bitflips, flip_bits);
+
+ /*
+ * The geo->payload_size maybe not equal to the page size
+ * when the Subpage-Read is enabled.
+ */
+ memset(data, 0xff, geo->payload_size);
+
+ dev_dbg(this->dev, "The page(%d) is an erased page(%d,%d,%d,%d).\n",
+ page, chunk, threshold, flip_bits, flip_bits_noecc);
+ return true;
+}
+
static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
uint8_t *buf, int oob_required, int page)
{
@@ -1007,6 +1061,10 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
continue;
if (*status == STATUS_UNCORRECTABLE) {
+ if (gpmi_erased_check(this, payload_virt, i,
+ page, &max_bitflips))
+ break;
+
mtd->ecc_stats.failed++;
continue;
}
--
1.7.2.rc3
More information about the linux-mtd
mailing list