[PATCH 2/2] Use bit-wise majority to recover the contents of ONFI parameter

Wan, Jane (Nokia - US/Sunnyvale) jane.wan at nokia.com
Mon Apr 30 22:33:59 PDT 2018


Hi Miquèl,

Thank you for your response and feedback.  I've modified the fix based on your comments.  
Please see the updated patch file at the end of this message (also in attachment).
My answers to your comments/questions are inline in the previous message.

The new patch is rebased on top of v4.17-rc1.

Best regards,
Jane

Updated patch:
From e14ed7dc08296a52f81d14781dee2f455dd90bbd Mon Sep 17 00:00:00 2001
From: Jane Wan <Jane.Wan at nokia.com>
Date: Mon, 30 Apr 2018 14:05:40 -0700
Subject: [PATCH 2/2] mtd: rawnand: fsl_ifc: use bit-wise majority to recover
 the contents of ONFI parameter

Per ONFI specification (Rev. 4.0), if all parameter pages have invalid
CRC values, the bit-wise majority may be used to recover the contents of
the parameter pages from the parameter page copies present.

Signed-off-by: Jane Wan <Jane.Wan at nokia.com>
---
 drivers/mtd/nand/raw/nand_base.c |   36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 72f3a89..464c4fb 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -5086,6 +5086,8 @@ static int nand_flash_detect_ext_param_page(struct nand_chip *chip,
 	return ret;
 }
 
+#define GET_BIT(bit, val)   (((val) >> (bit)) & 0x01)
+
 /*
  * Check if the NAND chip is ONFI compliant, returns 1 if it is, 0 otherwise.
  */
@@ -5094,7 +5096,8 @@ static int nand_flash_detect_onfi(struct nand_chip *chip)
 	struct mtd_info *mtd = nand_to_mtd(chip);
 	struct nand_onfi_params *p;
 	char id[4];
-	int i, ret, val;
+	int i, ret, val, pagesize;
+	u8 *buf;
 
 	/* Try ONFI for unknown chip or LP */
 	ret = nand_readid_op(chip, 0x20, id, sizeof(id));
@@ -5102,8 +5105,9 @@ static int nand_flash_detect_onfi(struct nand_chip *chip)
 		return 0;
 
 	/* ONFI chip: allocate a buffer to hold its parameter page */
-	p = kzalloc(sizeof(*p), GFP_KERNEL);
-	if (!p)
+	pagesize = sizeof(*p);
+	buf = kzalloc((pagesize * 3), GFP_KERNEL);
+	if (!buf)
 		return -ENOMEM;
 
 	ret = nand_read_param_page_op(chip, 0, NULL, 0);
@@ -5113,7 +5117,8 @@ static int nand_flash_detect_onfi(struct nand_chip *chip)
 	}
 
 	for (i = 0; i < 3; i++) {
-		ret = nand_read_data_op(chip, p, sizeof(*p), true);
+		p = (struct nand_onfi_params *)&buf[i*pagesize];
+		ret = nand_read_data_op(chip, p, pagesize, true);
 		if (ret) {
 			ret = 0;
 			goto free_onfi_param_page;
@@ -5126,8 +5131,27 @@ static int nand_flash_detect_onfi(struct nand_chip *chip)
 	}
 
 	if (i == 3) {
-		pr_err("Could not find valid ONFI parameter page; aborting\n");
-		goto free_onfi_param_page;
+		int j, k, l;
+		u8 v, m;
+
+		pr_err("Could not find valid ONFI parameter page\n");
+		pr_info("Recover ONFI params with bit-wise majority\n");
+		for (j = 0; j < pagesize; j++) {
+			v = 0;
+			for (k = 0; k < 8; k++) {
+				m = 0;
+				for (l = 0; l < 3; l++)
+					m += GET_BIT(k, buf[l*pagesize + j]);
+				if (m > 1)
+					v |= BIT(k);
+			}
+			((u8 *)p)[j] = v;
+		}
+		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) !=
+				le16_to_cpu(p->crc)) {
+			pr_err("ONFI parameter recovery failed, aborting\n");
+			goto free_onfi_param_page;
+		}
 	}
 
 	/* Check version */
-- 
1.7.9.5

-----Original Message-----
From: Miquel Raynal [mailto:miquel.raynal at bootlin.com] 
Sent: Saturday, April 28, 2018 5:07 AM
To: Wan, Jane (Nokia - US/Sunnyvale) <jane.wan at nokia.com>
Cc: dwmw2 at infradead.org; computersforpeace at gmail.com; Bos, Ties (Nokia - US/Sunnyvale) <ties.bos at nokia.com>; linux-mtd at lists.infradead.org; linux-kernel at vger.kernel.org; Boris Brezillon <Boris.Brezillon at bootlin.com>
Subject: Re: [PATCH 2/2] Use bit-wise majority to recover the contents of ONFI parameter

Hi Jane,

Same comments as before, please: get the right maintainers, add a commit log, rebase and fix the title prefix.
[Jane]  Added.  Thanks.

Have you ever needed/tried this algorithm before?
[Jane] Yes, we got a batch of particularly bad NAND chips recently and we needed these changes to make them work reliably over temperature.  The patch was verified using these bad chips.

On Thu, 26 Apr 2018 17:19:56 -0700, Jane Wan <Jane.Wan at nokia.com> wrote:

> Signed-off-by: Jane Wan <Jane.Wan at nokia.com>
> ---
>  drivers/mtd/nand/nand_base.c |   35 +++++++++++++++++++++++++++++++----
>  1 file changed, 31 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/mtd/nand/nand_base.c 
> b/drivers/mtd/nand/nand_base.c index c2e1232..161b523 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -3153,8 +3153,10 @@ static int nand_flash_detect_onfi(struct mtd_info *mtd, struct nand_chip *chip,
>  					int *busw)
>  {
>  	struct nand_onfi_params *p = &chip->onfi_params;
> -	int i, j;
> -	int val;
> +	int i, j, k, len, val;
> +	uint8_t copy[3][256], v8;

Please use u8 instead of uint8_t (./scripts/checkpatch.pl --strict will give you the list of styling issues to fix.
[Jane] Changed.

I don't think you should allocate that much space on the stack, please use dynamic allocation.
[Jane] Please see new patch file.

> +
> +	len = (sizeof(*p) > 256) ? 256 : sizeof(*p);

This is a maximum derivation, there are helpers for that.
[Jane] Not needed after changing to dynamic allocation.

I am not sure this is relevant, won't you read only 256 bytes anyway?
[Jane] I modified the code to allocate a buffer to store all 3 pages.  If in case all 3 pages failed, use the data stored
In the buffer to recover the content through bit-wise majority.

>  
>  	/* Try ONFI for unknown chip or LP */
>  	chip->cmdfunc(mtd, NAND_CMD_READID, 0x20, -1); @@ -3170,11 +3172,36 
> @@ static int nand_flash_detect_onfi(struct mtd_info *mtd, struct nand_chip *chip,
>  				le16_to_cpu(p->crc)) {
>  			break;
>  		}

Space.
[Jane] checked the patch through checkpatch.pl.

> +		pr_err("CRC of parameter page %d is not valid\n", i);
> +		for (j = 0; j < len; j++)
> +			copy[i][j] = ((uint8_t *)p)[j];

'copy' is maybe not a meaningful name.
[Jane] 'copy' is removed.  Please see new patch file.

>  	}
>  
>  	if (i == 3) {
> -		pr_err("Could not find valid ONFI parameter page; aborting\n");
> -		return 0;
> +		pr_err("Could not find valid ONFI parameter page\n");
> +		pr_info("Recover ONFI parameters with bit-wise majority\n");
> +		for (j = 0; j < len; j++) {
> +			if (copy[0][j] == copy[1][j] ||
> +			    copy[0][j] == copy[2][j]) {
> +				((uint8_t *)p)[j] = copy[0][j];
> +			} else if (copy[1][j] == copy[2][j]) {
> +				((uint8_t *)p)[j] = copy[1][j];
> +			} else {
> +				((uint8_t *)p)[j] = 0;
> +				for (k = 0; k < 8; k++) {
> +					v8 = (copy[0][j] >> k) & 0x1;

v8 could be declared in the else statement of in the for loop.
The name could also be changed.
[Jane] Changed.  Please see the new patch file.

> +					v8 += (copy[1][j] >> k) & 0x1;
> +					v8 += (copy[2][j] >> k) & 0x1;
> +					if (v8 > 1)
> +						((uint8_t *)p)[j] |= (1 << k);

Please use the BIT() macro.
[Jane] Changed.  Please see the new patch file.

> +				}
> +			}
> +		}

Space.
[Jane] checked the patch through checkpatch.pl.

> +		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) !=
> +		    le16_to_cpu(p->crc)) {
> +			pr_err("ONFI parameter recovery failed, aborting\n");
> +			return 0;
> +		}
>  	}
>  
>  	/* Check version */

Thanks,
Miquèl

--
Miquel Raynal, Bootlin (formerly Free Electrons) Embedded Linux and Kernel engineering https://bootlin.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-mtd-rawnand-fsl_ifc-use-bit-wise-majority-to-recover.patch
Type: application/octet-stream
Size: 2891 bytes
Desc: 0002-mtd-rawnand-fsl_ifc-use-bit-wise-majority-to-recover.patch
URL: <http://lists.infradead.org/pipermail/linux-mtd/attachments/20180501/54467e65/attachment.obj>


More information about the linux-mtd mailing list