[PATCH v9 2/5] mtd: nand: vf610_nfc: add hardware BCH-ECC support
Stefan Agner
stefan at agner.ch
Fri Jul 31 16:35:52 PDT 2015
Hi Brian,
On 2015-08-01 01:09, Brian Norris wrote:
<snip>
>> +static inline int vf610_nfc_correct_data(struct mtd_info *mtd, uint8_t *dat)
>> +{
>> + struct vf610_nfc *nfc = mtd_to_nfc(mtd);
>> + u8 ecc_status;
>> + u8 ecc_count;
>> + int flip;
>> +
>> + ecc_status = __raw_readb(nfc->regs + ECC_SRAM_ADDR * 8 + ECC_OFFSET);
>> + ecc_count = ecc_status & ECC_ERR_COUNT;
>> + if (!(ecc_status & ECC_STATUS_MASK))
>> + return ecc_count;
>> +
>> + /*
>> + * On an erased page, bit count should be zero or at least
>> + * less then half of the ECC strength
>> + */
>> + flip = count_written_bits(dat, nfc->chip.ecc.size, ecc_count);
>
> Sorry I didn't notice this earlier, but it appears you are falling into
> the same trap that almost everyone else is -- it is not sufficient to
> check just the page area; you also need to check the OOB. Suppose that
> a MTD user wrote mostly-0xff data to the page, then the page accumulates
> bitflips in the spare area and a few in the page area, such that
> eventually HW ECC can't correct them. If there are few enough zero bits
> in the data area, you will mistakenly think that this is a blank page
> below, and memset() it to 0xff. That would be disastrous!
>
> Fortunately, your code is otherwise quite well structured and looks
> good. A tip below.
>
>> +
>> + if (flip > ecc_count && flip > (nfc->chip.ecc.strength / 2))
>> + return -1;
>> +
>> + /* Erased page. */
>> + memset(dat, 0xff, nfc->chip.ecc.size);
>> + return 0;
>> +}
>> +
>> +static int vf610_nfc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
>> + uint8_t *buf, int oob_required, int page)
>> +{
>> + int eccsize = chip->ecc.size;
>> + int stat;
>> +
>> + vf610_nfc_read_buf(mtd, buf, eccsize);
>> +
>> + if (oob_required)
>> + vf610_nfc_read_buf(mtd, chip->oob_poi, mtd->oobsize);
>
> To fix the bitflips issue above, you'll just want to unconditionally
> read the OOB (it's fine to ignore 'oob_required') and...
>
>> +
>> + stat = vf610_nfc_correct_data(mtd, buf);
>
> ...pass in chip->oob_poi as a third argument.
>
Hm, this probably will have an effect on performance, since we usually
omit the OOB if not requested. I could fetch the OOB from the NAND
controllers SRAM only if necessary (if HW ECC status is not ok...). Does
this sound reasonable?
>> +
>> + if (stat < 0)
>> + mtd->ecc_stats.failed++;
>> + else
>> + mtd->ecc_stats.corrected += stat;
>
> You've got another problem here: ecc.read_page() should be returning
> 'max_bitflips' here. So, since you have a single ECC region, this block
> should probably be:
>
> if (stat < 0) {
> mtd->ecc_stats.failed++;
> return 0;
> } else {
> mtd->ecc_stats.corrected += stat;
> return stat;
> }
>
Ok, will change that.
--
Stefan
More information about the linux-arm-kernel
mailing list