[PATCH v4 0/2] mtd: hisilicon: add a new driver for NAND controller of hisilicon hip04 Soc

Brian Norris computersforpeace at gmail.com
Thu Jan 22 00:45:29 PST 2015


Hi Zhou,

On Thu, Jan 22, 2015 at 11:27:01AM +0800, Zhou Wang wrote:
> Very sorry for the late reply. I ran the tests again and also talked with a
> colleague who works on the NAND controller hardware. Please find my reply below.

No problem. Glad to hear you followed through on this one, as the
results were curious.

> On 2015/1/13 12:17, Brian Norris wrote:
> > On Wed, Dec 17, 2014 at 07:05:47PM +0800, Zhou Wang wrote:
> >> On 2014-12-17 14:23, Brian Norris wrote:
> > [...]
> >>>> [  104.648056] mtd_nandbiterrs: ECC failure, read data is incorrect
> >>>> despite read success
> >>>> insmod: can't insert 'mtd_nandbiterrs.ko': Input/output error
[...]
> I ran the tests again in 1bit/ECC and 16bit/ECC modes using a 2K (page) + 64B (OOB)
> NAND flash. Here are the logs; I also printed the ECC codes in the OOB area.
> 
> Results are:
> 1. In 16bit/ECC, it returns -EBADMSG because the ECC codes have been corrupted.
> 2. In 1bit/ECC, it does not return -EBADMSG because of a hardware design problem.
>    I will explain the details below.
> 
> Test logs:
> 1. in 16bit/ECC(print ECC codes):
> 
> /home # insmod mtd_nandbiterrs.ko dev=2 page_offset=1 seed=110 mode=0

...

> mtd_nandbiterrs: error: read failed at 0x800
> mtd_nandbiterrs: After 1 biterrors per subpage, read reported error -74

^^^ Ah, that's what I would expect from a driver that doesn't implement
the raw() functions.

> mtd_nandbiterrs: finished successfully.
> ==================================================
> insmod: can't insert 'mtd_nandbiterrs.ko': Input/output error
> 
> 2. in 1bit/ECC(print ECC codes):
> /home # insmod mtd_nandbiterrs.ko dev=2 page_offset=1 seed=110 mode=0

...

> mtd_nandbiterrs: ECC failure, read data is incorrect despite read success
> insmod: can't insert 'mtd_nandbiterrs.ko': Input/output error
> 
> Reason for the above 1bit/ECC test result:

...

> This NAND controller cannot correct this kind of 2-bit error in 1bit/ECC
> mode; however, it still triggers a correctable interrupt. As a result,
> software cannot find this 1-bit error in the page data.

IOW, uncorrectable errors are getting reported as corrected bitflips?
That does sound bad.
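
To illustrate how a single-error-correcting code can turn a 2-bit error into a
bogus "corrected" result, here is a minimal sketch using a textbook
Hamming(7,4) code. This is only an assumption for illustration: the thread does
not say which code the controller uses in 1bit/ECC mode, and all function names
below are hypothetical.

```c
#include <stdint.h>

/* Illustrative only: a textbook Hamming(7,4) code, NOT the hip04 ECC.
 * Codeword bits live at positions 1..7 (bit 0 of the byte is unused). */

/* XOR of the positions of all set bits; 0 for a valid codeword. */
static unsigned hamming_syndrome(uint8_t cw)
{
    unsigned s = 0;

    for (unsigned pos = 1; pos <= 7; pos++)
        if (cw & (1u << pos))
            s ^= pos;
    return s;
}

/* Encode the low 4 bits of data; data sits at positions 3, 5, 6, 7. */
static uint8_t hamming_encode(uint8_t data)
{
    static const unsigned data_pos[4] = { 3, 5, 6, 7 };
    uint8_t cw = 0;
    unsigned s;

    for (unsigned i = 0; i < 4; i++)
        if (data & (1u << i))
            cw |= (uint8_t)(1u << data_pos[i]);

    /* Parity bits at positions 1, 2, 4 cancel the data syndrome. */
    s = hamming_syndrome(cw);
    for (unsigned b = 0; b < 3; b++)
        if (s & (1u << b))
            cw |= (uint8_t)(1u << (1u << b));
    return cw;
}

/* Decode; returns the number of bitflips the decoder *believes* it fixed. */
static int hamming_decode(uint8_t cw, uint8_t *data)
{
    static const unsigned data_pos[4] = { 3, 5, 6, 7 };
    unsigned s = hamming_syndrome(cw);
    int corrected = 0;

    if (s) {
        /* A nonzero syndrome is always treated as one flipped bit. */
        cw ^= (uint8_t)(1u << s);
        corrected = 1;
    }

    *data = 0;
    for (unsigned i = 0; i < 4; i++)
        if (cw & (1u << data_pos[i]))
            *data |= (uint8_t)(1u << i);
    return corrected;
}
```

Flipping one bit of hamming_encode(0xB) decodes back to 0xB with one reported
correction. Flipping two bits (say positions 3 and 5) also reports exactly one
correction, but the decoder flips a third bit and hands back wrong data. A code
with no double-error detection cannot tell the two cases apart.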

> This is a hardware problem of this NAND controller.
> I plan to remove the 1bit/ECC mode support in the next version of the patch.

OK, sounds good. 1-bit HW ECC is not really very useful these days
anyway, if your higher-bit ECC can serve to replace it. Can the ECC
bytes still fit in the same spare area, though?
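
A rough way to estimate the answer, assuming the higher-strength mode is a BCH
code (the thread does not say so) and using the usual bound of at most m * t
parity bits per ECC step. The step size is also an assumption, and the helper
names are hypothetical:

```c
/* Rough BCH parity-size estimate. The code family and the step size are
 * assumptions; check the controller documentation for the real layout. */

/* Field order m: smallest m with 2^m > 8 * step_bytes. */
static unsigned bch_field_order(unsigned step_bytes)
{
    unsigned m = 1;

    while ((1u << m) <= 8u * step_bytes)
        m++;
    return m;
}

/* Parity bytes per ECC step: m * strength bits, rounded up to bytes. */
static unsigned bch_ecc_bytes_per_step(unsigned step_bytes, unsigned strength)
{
    return (bch_field_order(step_bytes) * strength + 7) / 8;
}

/* Parity bytes for a whole page made of page_bytes / step_bytes steps. */
static unsigned bch_ecc_bytes_per_page(unsigned page_bytes,
                                       unsigned step_bytes,
                                       unsigned strength)
{
    return (page_bytes / step_bytes) *
           bch_ecc_bytes_per_step(step_bytes, strength);
}
```

Under these assumptions, 16-bit correction over 1024-byte steps on a 2K page
comes to 56 bytes, which would leave 8 of the 64 OOB bytes for bad block
markers and metadata. The same strength over 512-byte steps comes to 104 bytes
and would not fit.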

> > Are you saying you cannot implement the raw() hooks for this IP? Or just
> > that you haven't yet? The latter is probably OK for now (I'd recommend
> > doing this, or at least mark a TODO in the code), but the former is a
> > little disturbing.
> 
> The raw() hooks just write the page data to flash, is that right?

Right, just data (and OOB, if calling the _oob_ functions) without any
ECC parity bytes.

> In non-ECC mode, it can write page data alone to flash. But in ECC mode, the NAND
> controller generates the ECC code automatically and writes both page data and ECC code
> to flash. In ECC mode, this NAND controller cannot write page data alone to flash.

Perhaps you can switch between ECC mode and non-ECC mode?
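
Something along these lines, as a userspace sketch of the idea only. The
struct and functions below are a made-up model, not the hip04 register
interface or the kernel's nand_chip hooks, and the "parity" is a toy XOR byte:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 2048
#define DEMO_OOB_SIZE  64

/* Hypothetical model of a controller with a switchable ECC engine. */
struct demo_nand {
    bool ecc_enabled;
    uint8_t data[DEMO_PAGE_SIZE];
    uint8_t oob[DEMO_OOB_SIZE];
};

/* Stand-in for the hardware programming one page. With ECC on, the
 * "hardware" overwrites the last OOB byte with a toy XOR parity. */
static void demo_program_page(struct demo_nand *n, const uint8_t *data,
                              const uint8_t *oob)
{
    memcpy(n->data, data, DEMO_PAGE_SIZE);
    memcpy(n->oob, oob, DEMO_OOB_SIZE);

    if (n->ecc_enabled) {
        uint8_t parity = 0;

        for (size_t i = 0; i < DEMO_PAGE_SIZE; i++)
            parity ^= data[i];
        n->oob[DEMO_OOB_SIZE - 1] = parity;
    }
}

/* Raw write: turn ECC off for this one operation, then restore it. */
static void demo_write_page_raw(struct demo_nand *n, const uint8_t *data,
                                const uint8_t *oob)
{
    bool saved = n->ecc_enabled;

    n->ecc_enabled = false;
    demo_program_page(n, data, oob);
    n->ecc_enabled = saved;
}
```

The point is only that the raw path saves the current mode, disables ECC for
the single operation, and restores the mode afterwards, so the normal ECC
write path is unaffected. Whether the real hardware allows toggling the mode
per operation is exactly the open question.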

At any rate, this isn't absolutely required.

> As a result, the nandbiterrs test cannot pass.
> 
> I don't know if I have explained these two problems clearly. If anything is
> still unclear, please let me know.

Brian


