[PATCH] mtd: nand: omap: Fix BCH bit correction
Daniel Schultz
d.schultz at phytec.de
Fri Jun 9 01:17:55 PDT 2017
Hi Sascha,
Am 07.06.2017 um 08:53 schrieb Sascha Hauer:
> On Wed, Jun 07, 2017 at 08:49:09AM +0200, Sascha Hauer wrote:
>> On Wed, Jun 07, 2017 at 08:45:08AM +0200, Sascha Hauer wrote:
>>> +Cc Matt Reimer <mreimer at sdgsystems.com>
>>>
>>> On Tue, Jun 06, 2017 at 06:10:25PM +0200, Daniel Schultz wrote:
>>>> After commit dec7b4d2bf9 was applied our barebox only corrected the
>>>> first 512 Bytes of NAND pages.
>>>>
>>>> This patch separates between Hamming and BCH when finding out the
>>>> eccsteps, because BCH always works with 2kB pages.
>>>>
>>>> Before this patch:
>>>>
>>>> barebox at Phytec phyCORE AM335x:/ nand_bitflip -r -n 5 /dev/nand0.barebox
>>>> nand0.barebox: Flipping bit 5 @ 1796
>>>> nand0.barebox: Flipping bit 6 @ 1258
>>>> nand0.barebox: Flipping bit 5 @ 1062
>>>> nand0.barebox: Flipping bit 2 @ 1399
>>>> nand0.barebox: Flipping bit 6 @ 1243
>>>> No bitflips found on block 0, offset 0x00000000
>>>> barebox at Phytec phyCORE AM335x:/ nand_bitflip -r -n 5 /dev/nand0.barebox
>>>> nand0.barebox: Flipping bit 2 @ 872
>>>> nand0.barebox: Flipping bit 4 @ 252
>>>> nand0.barebox: Flipping bit 3 @ 568
>>>> nand0.barebox: Flipping bit 2 @ 247
>>>> nand0.barebox: Flipping bit 5 @ 401
>>>> page at block 0, offset 0x00000000 has 3 bitflips
>>>>
>>>> After this patch:
>>>>
>>>> barebox at Phytec phyCORE AM335x:/ nand_bitflip -r -n 5 /dev/nand0.barebox
>>>> nand0.barebox: Flipping bit 2 @ 1962
>>>> nand0.barebox: Flipping bit 0 @ 1563
>>>> nand0.barebox: Flipping bit 0 @ 1808
>>>> nand0.barebox: Flipping bit 6 @ 1460
>>>> nand0.barebox: Flipping bit 7 @ 2034
>>>> page at block 0, offset 0x00000000 has 5 bitflips
>>>> barebox at Phytec phyCORE AM335x:/ nand_bitflip -r -n 5 /dev/nand0.barebox
>>>> nand0.barebox: Flipping bit 1 @ 1352
>>>> nand0.barebox: Flipping bit 7 @ 1542
>>>> nand0.barebox: Flipping bit 2 @ 1021
>>>> nand0.barebox: Flipping bit 7 @ 691
>>>> nand0.barebox: Flipping bit 6 @ 1196
>>>> page at block 0, offset 0x00000000 has 10 bitflips, needs cleanup
>>>>
>>>> Signed-off-by: Daniel Schultz <d.schultz at phytec.de>
>>>> ---
>>>> drivers/mtd/nand/nand_omap_gpmc.c | 11 +++++++++--
>>>> 1 file changed, 9 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/mtd/nand/nand_omap_gpmc.c b/drivers/mtd/nand/nand_omap_gpmc.c
>>>> index 05c8486..61220da 100644
>>>> --- a/drivers/mtd/nand/nand_omap_gpmc.c
>>>> +++ b/drivers/mtd/nand/nand_omap_gpmc.c
>>>> @@ -302,10 +302,17 @@ static int omap_correct_bch(struct mtd_info *mtd, uint8_t *dat,
>>>> unsigned int err_loc[8];
>>>> int bitflip_count;
>>>> int bch_max_err;
>>>> + int eccsteps;
>>>>
>>>> - int eccsteps = (nand->ecc.mode == NAND_ECC_HW) &&
>>>> - (nand->ecc.size == 2048) ? 4 : 1;
>>>> int eccsize = oinfo->nand.ecc.bytes;
>>>> + if (oinfo->ecc_mode == OMAP_ECC_HAMMING_CODE_HW_ROMCODE)
>>>
>>> This is wrong. When in Hamming ECC mode you shouldn't get into this
>>> function. The test should always fail.
That's why I added this test. I don't know why this change was made [1]
>>>
>>>> + if ((nand->ecc.mode == NAND_ECC_HW) &&
>>>> + (nand->ecc.size == 2048))
>>>> + eccsteps = 4;
>>>> + else
>>>> + eccsteps = 1;
>>>
>>> The question is why ecc.size is set to the wrong value in the first
>>> place:
>>>
>>> case OMAP_ECC_BCH8_CODE_HW:
>>> ...
>>> oinfo->nand.ecc.size = 512 * 4;
>>>
>>> This seems to be wrong. The BCH controller works in 512 Byte chunks, so
>>> ecc.size should be 512. This would make the special cases in
>>> omap_correct_bch() unnecessary.
Only OMAP_ECC_BCH8_CODE_HW_ROMCODE can call
omap_gpmc_read_page_bch_rom_mode(). So, this should be no problem, but
this multiplying is not in the kernel. Maybe this can affect older
systems (OMAP_ECC_BCH8_CODE_HW is only used by old phycards).
>>>
>>> In dec7b4d2bf9 Matt said:
>>>
>>> | The fix is to pull over a bit of code from the kernel's
>>> | omap_correct_data() that sets eccsteps = 4 when the page size is 2048
>>> | bytes and hardware ECC is being used.
>>>
>>> In fact, this piece is in the kernel code:
>>>
>>> /* Ex NAND_ECC_HW12_2048 */
>>> if ((info->nand.ecc.mode == NAND_ECC_HW) &&
>>> (info->nand.ecc.size == 2048))
>>> blockCnt = 4;
>>> else
>>> blockCnt = 1;
[1] since this is from the Hamming logic and not BCH.
This is a snippet from the linux-ti kernel:
case OMAP_ECC_HAM1_CODE_HW:
pr_info("nand: using OMAP_ECC_HAM1_CODE_HW\n");
nand_chip->ecc.mode = NAND_ECC_HW;
nand_chip->ecc.bytes = 3;
nand_chip->ecc.size = 512;
nand_chip->ecc.strength = 1;
nand_chip->ecc.calculate = omap_calculate_ecc;
nand_chip->ecc.hwctl = omap_enable_hwecc;
nand_chip->ecc.correct = omap_correct_data;
/* define ECC layout */
ecclayout->eccbytes = nand_chip->ecc.bytes
...
case OMAP_ECC_BCH8_CODE_HW:
nand_chip->ecc.mode = NAND_ECC_HW;
nand_chip->ecc.size = 512;
nand_chip->ecc.bytes = 13 + 1;
nand_chip->ecc.strength = 8;
nand_chip->ecc.hwctl = omap_enable_hwecc_bch;
nand_chip->ecc.correct = omap_elm_correct_data;
nand_chip->ecc.calculate =
omap_calculate_ecc_bch;
nand_chip->ecc.read_page = omap_read_page_bch;
nand_chip->ecc.write_page = omap_write_page_bch;
>>>
>>> I just suspect this is never used, because ecc.size is correctly set to 512 in
>>> all cases. Then ecc.steps results in 4 for 2k page sizes and the framework correctly
>>> iterates over the ecc steps.
>>>
>>> Please give the attached test a try. It's completely untested.
>>
>> And can not work. Additionally eccsteps must be set to 1 in
>> omap_correct_bch(). This effectively makes the loop in this function
>> unnecessary which can then removed.
>
> Which then means omap_gpmc_read_page_bch_rom_mode() has to iterate over
> ecc.steps itself, just like the other read_page implementations in the
> framework do.
>
So, the previous assignment of eccsteps was fine?
> So long, enough of self-replying for now ;)
>
> Sascha
>
--
Mit freundlichen Grüßen,
With best regards,
Daniel Schultz
More information about the barebox
mailing list