[PATCH 3/9] mtd: nand: qcom: erased page detection for uncorrectable errors only

Abhishek Sahu absahu at codeaurora.org
Wed Apr 11 23:58:05 PDT 2018


On 2018-04-12 12:19, Miquel Raynal wrote:
> Hi Abhishek,
> 
> On Thu, 12 Apr 2018 12:03:58 +0530, Abhishek Sahu
> <absahu at codeaurora.org> wrote:
> 
>> On 2018-04-10 14:29, Miquel Raynal wrote:
>> > Hi Abhishek,
>> >
>> > On Wed,  4 Apr 2018 18:12:19 +0530, Abhishek Sahu
>> > <absahu at codeaurora.org> wrote:
>> >
>> >> The NAND flash controller generates ECC uncorrectable error
>> >> first in case of completely erased page. Currently driver
>> >> applies the erased page detection logic for other operation
>> >> errors also so fix this and return EIO for other operational
>> >> errors.
>> >
>> > I am sorry I don't understand very well what is the purpose of this
>> > patch, could you please explain it again?
>> >
>> > Do you mean that you want to avoid having rising ECC errors when you
>> > read erased pages?
>> >
>>   Thanks Miquel for your review.
>> 
>>   The QCOM NAND flash controller has built-in erased page
>>   detection HW.
>>   The flow in the HW when the controller reads an erased page
>>   is as follows:
>> 
>>   1. First, the ECC engine generates an ECC uncorrectable error,
>>      since it calculates the ECC over all-0xff data and compares
>>      the result with the ECC code in OOB (which is again all 0xff).
>>   2. After getting the ECC error, the erased CW detection HW checks
>>      whether all the bytes in the page are 0xff and then updates the
>>      status in a separate register, NAND_ERASED_CW_DETECT_STATUS.
>> 
>>   So the erased CW detect status should be checked only if the
>>   ECC engine generated the uncorrectable error.
>> 
>>   Currently the erased CW detect register was also being checked
>>   for all other operational errors (like TIMEOUT, MPU errors etc).
> 
> This is very clear, thanks. I don't know this controller very well, so I
> think you can add this information to the commit message for future
> reference.
> 

  Sure Miquel.
  I will update the commit message to include more detail.
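
  For reference, the per-codeword decision described in the quoted text
  above can be sketched roughly as below. This is only an illustration,
  not the driver code: classify_cw() and enum cw_result are hypothetical
  names, and the bit positions are placeholders that should be checked
  against the definitions in qcom_nandc.c.

```c
#include <stdint.h>

/* Placeholder bit positions, assumed for illustration only. */
#define FS_OP_ERR             (1u << 4)
#define FS_MPU_ERR            (1u << 8)
#define BS_UNCORRECTABLE_BIT  (1u << 8)

/* Hypothetical result type; the driver does not define this. */
enum cw_result {
	CW_OK,            /* no error flagged for this codeword */
	CW_ECC_UNCORR,    /* ECC failure: now check the erased CW status */
	CW_FLASH_OP_ERR,  /* timeout, MPU error etc.: caller returns -EIO */
};

static enum cw_result classify_cw(uint32_t flash, uint32_t buffer)
{
	/* Only an ECC uncorrectable error can indicate an erased CW. */
	if ((flash & FS_OP_ERR) && (buffer & BS_UNCORRECTABLE_BIT))
		return CW_ECC_UNCORR;

	/* Any other flagged error is a plain operational failure. */
	if (flash & (FS_OP_ERR | FS_MPU_ERR))
		return CW_FLASH_OP_ERR;

	return CW_OK;
}
```

  The erased CW detect status would be consulted only in the
  CW_ECC_UNCORR case.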

>> 
>> >>
>> >> Signed-off-by: Abhishek Sahu <absahu at codeaurora.org>
>> >> ---
>> >>  drivers/mtd/nand/qcom_nandc.c | 8 +++++++-
>> >>  1 file changed, 7 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/drivers/mtd/nand/qcom_nandc.c b/drivers/mtd/nand/qcom_nandc.c
>> >> index 17321fc..57c16a6 100644
>> >> --- a/drivers/mtd/nand/qcom_nandc.c
>> >> +++ b/drivers/mtd/nand/qcom_nandc.c
>> >> @@ -1578,6 +1578,7 @@ static int parse_read_errors(struct qcom_nand_host *host, u8 *data_buf,
>> >>  	struct nand_ecc_ctrl *ecc = &chip->ecc;
>> >>  	unsigned int max_bitflips = 0;
>> >>  	struct read_stats *buf;
>> >> +	bool flash_op_err = false;
>> >>  	int i;
>> >>
>> >>  	buf = (struct read_stats *)nandc->reg_read_buf;
>> >> @@ -1599,7 +1600,7 @@ static int parse_read_errors(struct qcom_nand_host *host, u8 *data_buf,
>> >>  		buffer = le32_to_cpu(buf->buffer);
>> >>  		erased_cw = le32_to_cpu(buf->erased_cw);
>> >>
>> >> -		if (flash & (FS_OP_ERR | FS_MPU_ERR)) {
>> >> +		if ((flash & FS_OP_ERR) && (buffer & BS_UNCORRECTABLE_BIT)) {
>> >
>> > And later you have another "if (buffer & BS_UNCORRECTABLE_BIT)" which
>> > is then redundant, unless that is not what you actually want to do?
>> 
>>   Yes. That check seems to be redundant. I will fix that.
>> 
>> >
>> > Maybe you can add comments before the if ()/ else if () to explain in
>> > which case you enter each branch.
>> 
>>   Sure. That would be better. Will add the same in next patch set.
>> 
>> >
>> >>  			bool erased;
>> >>
>> >>  			/* ignore erased codeword errors */
>> >> @@ -1641,6 +1642,8 @@ static int parse_read_errors(struct qcom_nand_host *host, u8 *data_buf,
>> >>  						max_t(unsigned int, max_bitflips, ret);
>> >>  				}
>> >>  			}
>> >> +		} else if (flash & (FS_OP_ERR | FS_MPU_ERR)) {
>> >> +			flash_op_err = true;
>> >>  		} else {
>> >>  			unsigned int stat;
>> >>
>> >> @@ -1654,6 +1657,9 @@ static int parse_read_errors(struct qcom_nand_host *host, u8 *data_buf,
>> >>  			oob_buf += oob_len + ecc->bytes;
>> >>  	}
>> >>
>> >> +	if (flash_op_err)
>> >> +		return -EIO;
>> >> +
>> >
>> > If you are propagating an error related to the controller, this is
>> > fine, but I think you just want to raise the fact that a NAND
>> > uncorrectable error occurred, in this case you should just increment
>> > mtd->ecc_stats.failed and return 0 (returning max_bitflips here would
>> > be fine too as it would be 0 too).
>> 
>>    The flash_op_err will be for other operational errors only (like
>>    timeout, MPU error, device failure etc). For correctable errors,
>> 
>>    ret = nand_check_erased_ecc_chunk(data_buf,
>>                            data_len, eccbuf, ecclen, oob_buf,
>>                            extraooblen, ecc->strength);
> 
> Why do you need nand_check_erased_ecc_chunk() if the blank page check
> is done in hw?
> 

  The HW blank page check is only available with the BCH algorithm.
  IPQ806x uses RS code for 4-bit ECC, which does not have HW blank page
  detection.
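
  To illustrate what a software blank check amounts to when the HW
  cannot do it: count the zero bits in the chunk and treat the chunk as
  erased if the count stays within the ECC strength. The sketch below is
  plain userspace C with a hypothetical helper, check_erased_chunk(). It
  only mirrors the idea behind nand_check_erased_ecc_chunk() and is not
  the kernel implementation (the kernel function also covers the ECC and
  extra OOB bytes and returns -EBADMSG instead of -1).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns the number of zero bits (bitflips) found in buf and rewrites
 * buf to all 0xff, or returns -1 and leaves buf untouched if the count
 * exceeds threshold.
 */
static int check_erased_chunk(uint8_t *buf, size_t len, int threshold)
{
	int flips = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t zeros = (uint8_t)~buf[i];

		/* Clear the lowest set bit until none remain. */
		while (zeros) {
			zeros &= (uint8_t)(zeros - 1);
			flips++;
		}
		if (flips > threshold)
			return -1;
	}

	/* Few enough bitflips: report the chunk as a clean erased chunk. */
	memset(buf, 0xff, len);
	return flips;
}
```

  Here threshold plays the role of ecc->strength in the call quoted
  above.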

  You can find more detail in the function comment of
  erased_chunk_check_and_fixup():

   /*
   * when using BCH ECC, the HW flags an error in NAND_FLASH_STATUS if it
   * read an erased CW, and reports an erased CW in
   * NAND_ERASED_CW_DETECT_STATUS.
   *
   * when using RS ECC, the HW reports the same errors when reading an
   * erased CW, but it notifies that it is an erased CW by placing special
   * characters at certain offsets in the buffer.
   *
   * verify if the page is erased or not, and fix up the page for RS ECC by
   * replacing the special characters with 0xff.
   */
  static bool erased_chunk_check_and_fixup(u8 *data_buf, int data_len)
  {

  Thanks,
  Abhishek



