[PATCH v9 3/5] mtd: nand: omap: ecc.correct: omap_elm_correct_data: fix erased-page detection for BCHx_HW ECC schemes
Brian Norris
computersforpeace at gmail.com
Thu Mar 20 05:27:46 EDT 2014
Hi Pekon,
Thanks for reworking this. I see one issue, but it's not actually a new
issue with your patch. Maybe it deserves a follow-up patch?
On Tue, Mar 18, 2014 at 06:56:44PM +0530, Pekon Gupta wrote:
> As erased-pages do not have ECC stored in their OOB area, so they need to be
> seperated out from programmed-pages, before doing BCH ECC correction.
>
> In current implementation of omap_elm_correct_data() which does ECC correction
> for BCHx ECC schemes, this erased-pages are detected based on specific marker
> byte (reserved as 0x00) in ecc-layout.
> However, this approach has some limitation like;
> 1) All ecc-scheme layouts do not have such Reserved byte marker to
> differentiate between erased-page v/s programmed-page. Thus this is a
> customized solution.
> 2) Reserved marker byte can itself be subjected to bit-flips causing
> erased-page to be misunderstood as programmed-page.
>
> This patch removes dependency on any marker byte in ecc-layout, instead it
> compares calc_ecc[] with pattern of ECC-of-all(0xff). This implicitely
> means that both 'data + oob == all(0xff).
>
> Signed-off-by: Pekon Gupta <pekon at ti.com>
> ---
> drivers/mtd/nand/omap2.c | 85 +++++++++++++++++++-----------------------------
> 1 file changed, 33 insertions(+), 52 deletions(-)
>
> diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
> index 4bf2f76d..a6c979f 100644
> --- a/drivers/mtd/nand/omap2.c
> +++ b/drivers/mtd/nand/omap2.c
> @@ -1338,21 +1338,8 @@ static int erased_sector_bitflips(u_char *data, u_char *oob,
> * @calc_ecc: ecc read from HW ECC registers
> *
> * Calculated ecc vector reported as zero in case of non-error pages.
> - * In case of error/erased pages non-zero error vector is reported.
> - * In case of non-zero ecc vector, check read_ecc at fixed offset
> - * (x = 13/7 in case of BCH8/4 == 0) to find page programmed or not.
> - * To handle bit flips in this data, count the number of 0's in
> - * read_ecc[x] and check if it greater than 4. If it is less, it is
> - * programmed page, else erased page.
> - *
> - * 1. If page is erased, check with standard ecc vector (ecc vector
> - * for erased page to find any bit flip). If check fails, bit flip
> - * is present in erased page. Count the bit flips in erased page and
> - * if it falls under correctable level, report page with 0xFF and
> - * update the correctable bit information.
> - * 2. If error is reported on programmed page, update elm error
> - * vector and correct the page with ELM error correction routine.
> - *
> + * In case of non-zero ecc vector, first filter out erased-pages, and
> + * then process data via ELM to detect bit-flips.
> */
> static int omap_elm_correct_data(struct mtd_info *mtd, u_char *data,
> u_char *read_ecc, u_char *calc_ecc)
> @@ -1367,6 +1354,8 @@ static int omap_elm_correct_data(struct mtd_info *mtd, u_char *data,
> u_char *ecc_vec = calc_ecc;
> u_char *spare_ecc = read_ecc;
> u_char *erased_ecc_vec;
> + u_char *buf;
> + int bitflip_count;
> enum bch_ecc type;
> bool is_error_reported = false;
>
> @@ -1374,10 +1363,12 @@ static int omap_elm_correct_data(struct mtd_info *mtd, u_char *data,
> case OMAP_ECC_BCH4_CODE_HW:
> /* omit 7th ECC byte reserved for ROM code compatibility */
> actual_eccbytes = ecc->bytes - 1;
> + erased_ecc_vec = bch4_vector;
> break;
> case OMAP_ECC_BCH8_CODE_HW:
> /* omit 14th ECC byte reserved for ROM code compatibility */
> actual_eccbytes = ecc->bytes - 1;
> + erased_ecc_vec = bch8_vector;
> break;
> default:
> pr_err("invalid driver configuration\n");
> @@ -1389,10 +1380,8 @@ static int omap_elm_correct_data(struct mtd_info *mtd, u_char *data,
>
> if (info->nand.ecc.strength == BCH8_MAX_ERROR) {
> type = BCH8_ECC;
> - erased_ecc_vec = bch8_vector;
> } else {
> type = BCH4_ECC;
> - erased_ecc_vec = bch4_vector;
> }
>
> for (i = 0; i < eccsteps ; i++) {
> @@ -1410,44 +1399,36 @@ static int omap_elm_correct_data(struct mtd_info *mtd, u_char *data,
> }
>
> if (eccflag == 1) {
> - /*
> - * Set threshold to minimum of 4, half of ecc.strength/2
> - * to allow max bit flip in byte to 4
> - */
> - unsigned int threshold = min_t(unsigned int, 4,
> - info->nand.ecc.strength / 2);
> -
> - /*
> - * Check data area is programmed by counting
> - * number of 0's at fixed offset in spare area.
> - * Checking count of 0's against threshold.
> - * In case programmed page expects at least threshold
> - * zeros in byte.
> - * If zeros are less than threshold for programmed page/
> - * zeros are more than threshold erased page, either
> - * case page reported as uncorrectable.
> - */
> - if (hweight8(~read_ecc[actual_eccbytes]) >= threshold) {
> + if (memcmp(calc_ecc, erased_ecc_vec,
> + actual_eccbytes) == 0) {
> /*
> - * Update elm error vector as
> - * data area is programmed
> + * calc_ecc[] matches pattern for ECC(all 0xff)
> + * so this is definitely an erased-page
> */
> - err_vec[i].error_reported = true;
> - is_error_reported = true;
> } else {
> - /* Error reported in erased page */
> - int bitflip_count;
> - u_char *buf = &data[info->nand.ecc.size * i];
> -
> - if (memcmp(calc_ecc, erased_ecc_vec,
> - actual_eccbytes)) {
> - bitflip_count = erased_sector_bitflips(
> - buf, read_ecc, info);
> -
> - if (bitflip_count)
> - stat += bitflip_count;
> - else
> - return -EINVAL;
> + buf = &data[info->nand.ecc.size * i];
> + /*
> + * count number of 0-bits in read_buf.
> + * This check can be removed once a similar
> + * check is introduced in generic NAND driver
> + */
> + bitflip_count = erased_sector_bitflips(
> + buf, read_ecc, info);
> + if (bitflip_count) {
> + /*
> + * number of 0-bits within ECC limits
> + * So this may be an erased-page
> + */
> + stat += bitflip_count;
I see we update 'stat', but in the case that this is a correctable
erased page, don't we just ignore it? i.e., you'll hit this case below
(not shown here):
/* Check if any error reported */
if (!is_error_reported)
return 0;
Perhaps this should be changed to:
if (!is_error_reported)
return stat;
?
> + } else {
> + /*
> + * Too many 0-bits. It may be a
> + * - programmed-page, OR
> + * - erased-page with many bit-flips
> + * So this page requires check by ELM
> + */
> + err_vec[i].error_reported = true;
> + is_error_reported = true;
> }
> }
> }
Brian
More information about the linux-mtd
mailing list