[PATCH v7] mtd: rawnand: use bit-wise majority to recover the contents of ONFI parameter

Chris Moore moore at free.fr
Wed May 16 21:28:28 PDT 2018


Hi,

Le 16/05/2018 à 09:56, Boris Brezillon a écrit :
> On Wed, 16 May 2018 09:32:57 +0200
> Chris Moore <moore at free.fr> wrote:
>
>> Hi,
>>
>> Le 15/05/2018 à 09:34, Boris Brezillon a écrit :
>>> On Tue, 15 May 2018 06:45:51 +0200
>>> Chris Moore <moore at free.fr> wrote:
>>>   
>>>> Hi,
>>>>
>>>> Le 13/05/2018 à 06:30, Wan, Jane (Nokia - US/Sunnyvale) a écrit :
>>>>> Per ONFI specification (Rev. 4.0), if all parameter pages have invalid CRC values, the bit-wise majority may be used to recover the contents of the parameter pages from the parameter page copies present.
>>>>>
>>>>> Signed-off-by: Jane Wan <Jane.Wan at nokia.com>
>>>>> ---
>>>>> v7: change debug print messages
>>>>> v6: support the cases that srcbufs are not contiguous
>>>>> v5: make the bit-wise majority functon generic
>>>>> v4: move the bit-wise majority code in a separate function
>>>>> v3: fix warning message detected by kbuild test robot
>>>>> v2: rebase the changes on top of v4.17-rc1
>>>>>     
>>>>> drivers/mtd/nand/raw/nand_base.c |   50 ++++++++++++++++++++++++++++++++++----
>>>>>     1 file changed, 45 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
>>>>> index 72f3a89..b43b784 100644
>>>>> --- a/drivers/mtd/nand/raw/nand_base.c
>>>>> +++ b/drivers/mtd/nand/raw/nand_base.c
>>>>> @@ -5087,6 +5087,35 @@ static int nand_flash_detect_ext_param_page(struct nand_chip *chip,
>>>>>     }
>>>>>     
>>>>>     /*
>>>>> + * Recover data with bit-wise majority
>>>>> + */
>>>>> +static void nand_bit_wise_majority(const void **srcbufs,
>>>>> +				   unsigned int nsrcbufs,
>>>>> +				   void *dstbuf,
>>>>> +				   unsigned int bufsize)
>>>>> +{
>>>>> +	int i, j, k;
>>>>> +
>>>>> +	for (i = 0; i < bufsize; i++) {
>>>>> +		u8 cnt, val;
>>>>> +
>>>>> +		val = 0;
>>>>> +		for (j = 0; j < 8; j++) {
>>>>> +			cnt = 0;
>>>>> +			for (k = 0; k < nsrcbufs; k++) {
>>>>> +				const u8 *srcbuf = srcbufs[k];
>>>>> +
>>>>> +				if (srcbuf[i] & BIT(j))
>>>>> +					cnt++;
>>>>> +			}
>>>>> +			if (cnt > nsrcbufs / 2)
>>>>> +				val |= BIT(j);
>>>>> +		}
>>>>> +		((u8 *)dstbuf)[i] = val;
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +/*
>>>>>      * Check if the NAND chip is ONFI compliant, returns 1 if it is, 0 otherwise.
>>>>>      */
>>>>>     static int nand_flash_detect_onfi(struct nand_chip *chip)
>>>>> @@ -5102,7 +5131,7 @@ static int nand_flash_detect_onfi(struct nand_chip *chip)
>>>>>     		return 0;
>>>>>     
>>>>>     	/* ONFI chip: allocate a buffer to hold its parameter page */
>>>>> -	p = kzalloc(sizeof(*p), GFP_KERNEL);
>>>>> +	p = kzalloc((sizeof(*p) * 3), GFP_KERNEL);
>>>>>     	if (!p)
>>>>>     		return -ENOMEM;
>>>>>     
>>>>> @@ -5113,21 +5142,32 @@ static int nand_flash_detect_onfi(struct nand_chip *chip)
>>>>>     	}
>>>>>     
>>>>>     	for (i = 0; i < 3; i++) {
>>>>> -		ret = nand_read_data_op(chip, p, sizeof(*p), true);
>>>>> +		ret = nand_read_data_op(chip, &p[i], sizeof(*p), true);
>>>>>     		if (ret) {
>>>>>     			ret = 0;
>>>>>     			goto free_onfi_param_page;
>>>>>     		}
>>>>>     
>>>>> -		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
>>>>> +		if (onfi_crc16(ONFI_CRC_BASE, (u8 *)&p[i], 254) ==
>>>>>     				le16_to_cpu(p->crc)) {
>>>>> +			if (i)
>>>>> +				memcpy(p, &p[i], sizeof(*p));
>>>>>     			break;
>>>>>     		}
>>>>>     	}
>>>>>     
>>>>>     	if (i == 3) {
>>>>> -		pr_err("Could not find valid ONFI parameter page; aborting\n");
>>>>> -		goto free_onfi_param_page;
>>>>> +		const void *srcbufs[3] = {p, p + 1, p + 2};
>>>>> +
>>>>> +		pr_warn("Could not find a valid ONFI parameter page, trying bit-wise majority to recover it\n");
>>>>> +		nand_bit_wise_majority(srcbufs, ARRAY_SIZE(srcbufs), p,
>>>>> +				       sizeof(*p));
>>>>> +
>>>>> +		if (onfi_crc16(ONFI_CRC_BASE, (u8 *)p, 254) !=
>>>>> +				le16_to_cpu(p->crc)) {
>>>>> +			pr_err("ONFI parameter recovery failed, aborting\n");
>>>>> +			goto free_onfi_param_page;
>>>>> +		}
>>>>>     	}
>>>>>     
>>>>>     	/* Check version */
>>>> This version is still hard coded for a three sample bitwise majority vote.
>>>> So why not use the method which I suggested previously for v2 and which
>>>> I repeat below?
>>> Because I want the nand_bit_wise_majority() function to work with
>>> nsrcbufs > 3 (the ONFI spec says there's at least 3 copy of the param
>>> page, but NAND vendor can decide to put more). Also, if the X copies of
>>> the PARAM are corrupted (which is rather unlikely), that means we
>>> already spent quite a lot of time reading the different copies and
>>> calculating the CRC, so I think we don't care about perf optimizations
>>> when doing bit-wise majority.
>>>   
>>>> The three sample bitwise majority can be implemented without bit level
>>>> manipulation using the identity:
>>>> majority3(a, b, c) = (a & b) | (a & c) | (b & c)
>>>> This can be factorized slightly to (a & (b | c)) | (b & c)
>>>> This enables the operation to be performed 8, 16, 32 or even 64 bits at
>>>> a time depending on the hardware.
>>>>
>>>> This method is not only faster and but also more compact.
>>>>   
>> I do understand that the ONFI specifications permit more than 3 copies.
>> However elsewhere the proposed code shows no intention of handling other
>> cases.
>> The constant 3 is hard coded in the following lines extracted from the
>> proposed code:
>> ...
>> +    p = kzalloc((sizeof(*p) * 3), GFP_KERNEL);
>> ...
>>        for (i = 0; i < 3; i++) {
>> ...
>>        if (i == 3) {
>> ...
>> +        const void *srcbufs[3] = {p, p + 1, p + 2};
>>
>> Moreover the last of these is difficult to generalize.
> Not that much. We just have to allocate srcbufs dynamically and
> krealloc() everytime we want to add a new entry.
>
> May I ask why you care that much about this optimization? As I said, if
> we really have to read all the copies to realize none of them is good,
> we already lost a lot of time, so having a "suboptimal but generic"
> version of the bit-wise majority shouldn't hurt that much.
>

I just wanted to point out that, as the proposed code only handles the 
case of 3 copies, there is a much more efficient way of performing the 
majority vote than looping over each bit.
My method would also reduce the code size which is an important factor 
in the kernel.
Otherwise I have no objection and I shall accept your choice without 
further remarks on this point.

Cheers,
Chris




More information about the linux-mtd mailing list