nand "BCH decoding failed" when using bch8_hw_romcode ecc mode

Trent Piepho trent.piepho at igorinstitute.com
Fri Mar 18 14:41:33 PDT 2022


On Fri, Mar 18, 2022 at 5:00 AM Sascha Hauer <sha at pengutronix.de> wrote:
>
> On Fri, Mar 18, 2022 at 11:32:36AM +0100, Tibault Damman wrote:
> > Gah, I just saw how that mail was sent, let me try that again:
> >
> >
> > Because the data looks right(?), despite the error prints, I tried
> > ubiformat again from barebox, then booted linux from SD, and attached
> > the ubi nand partition in linux... which worked fine.
> > All volumes and data were there.
> >
> > Very confused about what's going wrong here.
>
> You can correctly write to the NAND and can even correctly read the
> data, that's good news.

Yes, NAND hardware likely works fine.  This looks like a BCH layout flaw to me.

Some background:  (for Tibault, I'm sure Sascha knows this!)

Normal reads and writes to NAND use some kind of ECC.  The real data
is written, unmodified, to NAND, along with some extra bytes of ECC
information.  The real data and the ECC data are probably interleaved
in various complicated ways that are a pain to deal with.

Suppose the ECC data isn't handled correctly:  it's generated
incorrectly, *written to the wrong spot*, read from the wrong
location, etc.  Then ECC doesn't work.  But we still write the real
data somewhere too, and can then read it back from that same place.
Maybe we are writing the real data to the incorrect locations, but as
long as we read it back from the same incorrect locations, it appears
to work.

So what happens?  We can write data, then read it back, but get many
ECC errors, because ECC is broken.

This looks like your problem.
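
A toy sketch of that failure mode, in Python.  Everything here is made
up for illustration: the 16-byte page, the 4-byte OOB area, the
function names, and the one-byte XOR checksum standing in for real
BCH.  It is not how barebox or any real NAND driver lays out a page;
it only shows that data can round-trip while the ECC check fails when
writer and reader disagree on where the ECC byte lives.

```python
# Toy model of one NAND page: DATA_SIZE data bytes followed by a small
# out-of-band (OOB) area.  All sizes and names are hypothetical.
DATA_SIZE = 16
OOB_SIZE = 4


def checksum(data: bytes) -> int:
    """One-byte XOR checksum.  A stand-in for ECC, NOT real BCH."""
    c = 0
    for b in data:
        c ^= b
    return c


def write_page(data: bytes, ecc_offset: int) -> bytearray:
    """Store data unmodified, plus its checksum at ecc_offset in the OOB."""
    assert len(data) == DATA_SIZE and 0 <= ecc_offset < OOB_SIZE
    page = bytearray([0xFF] * (DATA_SIZE + OOB_SIZE))  # erased flash reads 0xFF
    page[:DATA_SIZE] = data
    page[DATA_SIZE + ecc_offset] = checksum(data)
    return page


def read_page(page: bytearray, ecc_offset: int):
    """Return (data, ecc_ok), looking for the checksum at ecc_offset."""
    data = bytes(page[:DATA_SIZE])
    stored = page[DATA_SIZE + ecc_offset]
    return data, stored == checksum(data)


payload = b"barebox ubi data"  # exactly 16 bytes

# Writer and reader agree on the layout: data and ECC both check out.
good_data, good_ecc = read_page(write_page(payload, ecc_offset=0), ecc_offset=0)

# Writer puts the checksum at offset 0, reader looks at offset 2:
# the data still comes back intact, but the ECC check fails.
bad_data, bad_ecc = read_page(write_page(payload, ecc_offset=0), ecc_offset=2)

print(good_data == payload, good_ecc)
print(bad_data == payload, bad_ecc)
```

In the mismatched case the reader sees an erased 0xFF byte where it
expects the checksum, so every page reports an ECC failure even though
the payload is fine.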

ECC needs to work, as modern MLC/TLC/QLC NAND is not reliable enough
to use without ECC.  You can test one page one time and see it work
with no errors, but test the entire chip many times and it will become
clear that raw NAND just isn't good enough.


