[PATCH 2/2] mtd: brcmnand: Detect sticky ucorr ecc error on dma reads

Kamal Dasu kamal.dasu at broadcom.com
Wed Jun 1 09:50:56 PDT 2016


Boris,

On Mon, May 30, 2016 at 4:50 AM, Boris Brezillon
<boris.brezillon at free-electrons.com> wrote:
> On Fri, 29 Apr 2016 16:21:25 -0400
> Kamal Dasu <kdasu.kdev at gmail.com> wrote:
>
>> This change provides a fix for controller bug where nand
>> controller could have a possible sticky error after a PIO
>> followed by a DMA read. The fix retries a read if we see
>> a uncorr_ecc after read to detect such sticky errors.
>>
>> Signed-off-by: Kamal Dasu <kdasu.kdev at gmail.com>
>> ---
>>  drivers/mtd/nand/brcmnand/brcmnand.c | 15 ++++++++++++++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/mtd/nand/brcmnand/brcmnand.c b/drivers/mtd/nand/brcmnand/brcmnand.c
>> index 29a9abd..13c7784 100644
>> --- a/drivers/mtd/nand/brcmnand/brcmnand.c
>> +++ b/drivers/mtd/nand/brcmnand/brcmnand.c
>> @@ -1555,9 +1555,11 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip,
>>       struct brcmnand_controller *ctrl = host->ctrl;
>>       u64 err_addr = 0;
>>       int err;
>> +     bool retry = true;
>>
>>       dev_dbg(ctrl->dev, "read %llx -> %p\n", (unsigned long long)addr, buf);
>>
>> +try_dmaread:
>>       brcmnand_write_reg(ctrl, BRCMNAND_UNCORR_COUNT, 0);
>>
>>       if (has_flash_dma(ctrl) && !oob && flash_dma_buf_ok(buf)) {
>> @@ -1579,7 +1581,18 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip,
>>
>>       if (mtd_is_eccerr(err)) {
>>               int ret;
>> -
>> +             /*
>> +              * On controller version >=7.0 if we are doing a DMA read
>> +              * after a prior PIO read that reported uncorrectable error,
>> +              * the DMA engine captures this error following DMA read
>> +              * cleared only on subsequent DMA read, so just retry once
>> +              * to clear a possible false error reported for current DMA
>> +              * read
>> +              */
>
> Hm, shouldn't this BRCMNAND_UNCORR_COUNT bit be cleared just after
> doing the PIO/DMA read instead of doing it before executing a new read?
> This would solve your problem without the need for this extra retry, or
> am I missing something?
>

Clearing the count registers or the interrupt registers does not clear
the condition. Only a clean read (of a page that has no errors) clears
it. So if this was a false error (the page is really clean) and we read
again, the retry clears the condition.
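
To illustrate the behaviour described above, here is a minimal standalone
sketch. It is only a model under stated assumptions, not the driver code:
struct fake_ctrl, fake_pio_read(), fake_dma_read() and read_with_retry()
are hypothetical names, and the "sticky" flag stands in for the latched
condition that register writes cannot clear.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the controller state; not the real brcmnand registers. */
struct fake_ctrl {
	bool sticky_uncorr;	/* latched uncorrectable-error condition */
};

/* Simulated PIO read: a really bad page latches the condition. */
static int fake_pio_read(struct fake_ctrl *ctrl, bool page_is_bad)
{
	if (page_is_bad) {
		ctrl->sticky_uncorr = true;
		return -EBADMSG;
	}
	return 0;
}

/*
 * Simulated DMA read. A really bad page always fails. A clean page
 * reports a false error once if the condition is latched, and that
 * same clean read is what clears the latch (assumed model).
 */
static int fake_dma_read(struct fake_ctrl *ctrl, bool page_is_bad)
{
	if (page_is_bad)
		return -EBADMSG;

	if (ctrl->sticky_uncorr) {
		ctrl->sticky_uncorr = false;
		return -EBADMSG;	/* false error from the stale latch */
	}
	return 0;
}

/* Retry exactly once on an ECC error, mirroring the retry flag in the patch. */
static int read_with_retry(struct fake_ctrl *ctrl, bool page_is_bad)
{
	bool retry = true;
	int err;

	for (;;) {
		err = fake_dma_read(ctrl, page_is_bad);
		if (err != -EBADMSG || !retry)
			return err;
		retry = false;	/* second failure is treated as real */
	}
}
```

In this model a clean page read after a failed PIO read succeeds on the
second attempt, while a page that is really bad still fails after the
single retry.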



Thanks
Kamal


