nand: WARNING: a0000000.nand: the ECC used on your system (1b/256B) is too weak compared to the one required by the NAND chip (4b/512B)

Christophe Leroy christophe.leroy at csgroup.eu
Fri Jun 18 07:18:11 PDT 2021



On 18/06/2021 at 08:43, Miquel Raynal wrote:
> Hi Christophe,
> 
> Christophe Leroy <christophe.leroy at csgroup.eu> wrote on Thu, 17 Jun
> 2021 19:17:05 +0200:
> 
>> Hello Miquel,
>>
>> I have a board running latest kernel with the following NAND:
>>
>> [    1.523076] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
>> [    1.529505] nand: Micron MT29F2G08ABAEAWP
>> [    1.533526] nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>> [    1.541196] nand: WARNING: a0000000.nand: the ECC used on your system (1b/256B) is too weak compared to the one required by the NAND chip (4b/512B)
>>
>> Until now I was using kernel 4.14 and I was having no problem, although it was also exhibiting the following (less detailed) warning:
> 
> Yes, I decided to give more info of what is the minimum ECC scheme that
> should be used and what is the one being applied.

Yes it was a good idea.

> 
>> [    0.591009] nand: WARNING: a0000000.nand: the ECC used on your system is too weak compared to the one required by the NAND chip
>>
>> From time to time I use one of the latest kernels (today that is 5.13-rc6), and at some point in the 5.x releases I started to get errors like:
>>
>> [    5.098265] ecc_sw_hamming_correct: uncorrectable ECC error
>> [    5.103859] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 60
>>    bytes from PEB 99:59824, read only 60 bytes, retry
>> [    5.525843] ecc_sw_hamming_correct: uncorrectable ECC error
>> [    5.531571] ecc_sw_hamming_correct: uncorrectable ECC error
>> [    5.537490] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 3073 bytes from PEB 107:108976, read only 3073 bytes, retry
>> [    5.691121] ecc_sw_hamming_correct: uncorrectable ECC error
>> [    5.696709] ecc_sw_hamming_correct: uncorrectable ECC error
>> [    5.702426] ecc_sw_hamming_correct: uncorrectable ECC error
>> [    5.708141] ecc_sw_hamming_correct: uncorrectable ECC error
>> [    5.714103] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 3035 bytes from PEB 107:25144, read only 3035 bytes, retry
>> [   20.523689] random: crng init done
>> [   21.892130] ecc_sw_hamming_correct: uncorrectable ECC error
>> [   21.897730] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 1394 bytes from PEB 116:75776, read only 1394 bytes, retry
>>
>> Most of the time, when the reading of the file fails, I just have to read it once more and it gets read without that error.
> 
> It really looks like a regular bitflip happening "sometimes". Is this a
> board which already had a life? What are the usage counters (UBI should
> tell you this) compared to the official endurance of your chip (see the
> datasheet)?

The board had a peaceful life:

UBI reports "ubi0: max/mean erase counter: 49/20, WL threshold: 4096"

I have tried with half a dozen boards and all of them have the issue.

> 
>> What am I supposed to do to avoid the ECC weakness warning at startup and to fix that ECC error issue ?
> 
> I honestly don't think the errors come from the 5.1x kernels given the
> above logs. If you flash back your old 4.14 I am pretty sure you'll
> have the same errors at some point.

I don't have any problem like that with 4.14 on any of the boards.

When booting a 4.14 kernel I don't get any problem on the same board.

> 
> NAND really is a fragile storage medium, not following in a production
> environment the minimum ECC scheme (there is a real difference between
> 1/256 vs 4/512) really leads to complicated solutions like this one,
> unfortunately...

I see the kernel has a "Software BCH ECC" option. Should I use that with this chip?

If yes, how do I use it? It seems that selecting the option at kernel build time is not enough. Do I
have to configure something somewhere, for instance in the device tree? At the moment I have the
following in the device tree:

		nand@2,0 {
			compatible = "gpio-control-nand";
			reg = <2 0x0000 0x1>;
			#address-cells = <1>;
			#size-cells = <1>;
			rdy-gpio = <&cpld_etat 13 0>;	/* RDY */
			nce-gpio = <&CPM1_PIO_D 15 0>;	/* nCE */
			ale-gpio = <&CPM1_PIO_D 13 0>;	/* ALE */
			cle-gpio = <&CPM1_PIO_D 12 0>;	/* CLE */
		};
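
To make the question concrete, here is a sketch of what I imagine the node could look like with
software BCH. It assumes that CONFIG_MTD_NAND_ECC_SW_BCH is enabled and that the gpio-control-nand
driver honours the generic raw NAND properties nand-ecc-algo, nand-ecc-strength and
nand-ecc-step-size, which I have not verified. The three added lines are my guess and are not
tested:

		nand@2,0 {
			compatible = "gpio-control-nand";
			reg = <2 0x0000 0x1>;
			#address-cells = <1>;
			#size-cells = <1>;
			rdy-gpio = <&cpld_etat 13 0>;	/* RDY */
			nce-gpio = <&CPM1_PIO_D 15 0>;	/* nCE */
			ale-gpio = <&CPM1_PIO_D 13 0>;	/* ALE */
			cle-gpio = <&CPM1_PIO_D 12 0>;	/* CLE */
			/* Assumption: request 4b/512B software BCH as asked by the chip */
			nand-ecc-algo = "bch";
			nand-ecc-strength = <4>;
			nand-ecc-step-size = <512>;
		};

With a 2048-byte page that would be 4 ECC steps of 7 bytes each, so 28 bytes out of the 64-byte
OOB. I suppose the flash would also have to be erased and rewritten, since data written with the
Hamming layout would not be readable with a different ECC scheme.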


Thanks
Christophe


