[PATCH/RFC v4 1/3] Shared BCH ECC library
Ricard Wanderlof
ricard.wanderlof at axis.com
Tue Mar 29 09:55:19 EDT 2011
On Fri, 11 Mar 2011, Ivan Djelic wrote:
> The first patch of this series contains a new generic BCH ECC library.
>
> This library can be used to provide software BCH correction on NAND
> devices (see 2nd and 3rd patch), as well as error correction for hybrid
> hardware BCH engines.
> ...
I've been trying to apply this patch series to evaluate how fast BCH is
compared to the existing 1-bit ECC algorithm, to get some idea of the
performance hit incurred in practice once higher-order ECC becomes
necessary.
I've therefore applied the patch series on a MIPS system running Linux
2.6.35 (we don't have the latest kernel ported to our platform yet). Yes,
I realize that's not the latest kernel version; on the other hand, I can't
see anything in the patch that seems to depend on anything recent in mtd.
You basically have a BCH library, and a small nand driver implementing ECC
functionality using the library, plus corresponding modifications to some
mtd nand files to incorporate the library. The patch didn't apply cleanly,
but it was fairly obvious where all the bits should go.
The problem I have is I can't get the correction to work properly. I've
set up our nand flash driver to use the default setting of eccsize = 512
bytes and eccbytes = 7 to get 4-bit error correction capability per 512
bytes.
When writing to the flash, I get 28 ECC bytes per 2048-byte page (4 x 7),
at the end of each OOB, as expected.
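
As a sanity check on the numbers above, here is a small sketch of the
arithmetic. It assumes a standard binary BCH code over GF(2^m) that needs
at most m*t parity bits for t-bit correction; bch_params() is a
hypothetical helper for illustration, not a function from the patch.

```python
def bch_params(eccsize, t):
    """Return (m, ecc_bytes) for t-bit correction over eccsize data bytes.

    Assumption: binary BCH code with at most m*t parity bits, and a
    codeword that must fit in 2^m - 1 bits.
    """
    data_bits = eccsize * 8
    m = 1
    # Find the smallest m whose codeword length 2^m - 1 can hold the
    # data bits plus the m*t parity bits.
    while (1 << m) - 1 < data_bits + m * t:
        m += 1
    ecc_bytes = (m * t + 7) // 8  # round parity bits up to whole bytes
    return m, ecc_bytes


m, ecc_bytes = bch_params(512, 4)
steps = 2048 // 512               # ECC steps per 2048-byte page
print(m, ecc_bytes, steps * ecc_bytes)  # prints "13 7 28"
```

Under these assumptions, 512 data bytes with t = 4 give m = 13 and 52
parity bits, i.e. 7 bytes per step and 28 bytes per page, which matches
what ends up in the OOB.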
When reading, if there are no bit errors in the read data, everything
works fine.
However, if I introduce a single bit error in the page, bch_decode() fails
with -EBADMSG. Some further debugging reveals that
bch.c:compute_error_locator_polynomial() returns 4 in this particular
case, whereas bch.c:find_poly_roots() returns 0. Since the two don't
match, the function exits with an error. I'm no wizard with the algorithms
used, so I have no idea what is reasonable, but I would assume both should
return 1, as there is one bit error that I've introduced.
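
The expectation that both values should be 1 can be illustrated with a toy
model. This is only a sketch and not the code from bch.c: it assumes
GF(2^13) built from x^13 + x^4 + x^3 + x + 1 (0x201b), a hypothetical
error position of 1234, and a brute-force root search standing in for
whatever find_poly_roots() does internally. For a single error at position
p, the error locator polynomial is 1 + alpha^p * x, which has degree 1 and
exactly one root.

```python
M = 13
PRIM_POLY = 0x201B          # x^13 + x^4 + x^3 + x + 1 (assumed here)
N = (1 << M) - 1            # number of nonzero field elements

# Build exp/log tables for GF(2^M); EXP is doubled to avoid a modulo.
EXP = [0] * (2 * N)
LOG = [0] * (N + 1)
val = 1
for i in range(N):
    EXP[i] = val
    LOG[val] = i
    val <<= 1
    if val & (1 << M):
        val ^= PRIM_POLY
for i in range(N, 2 * N):
    EXP[i] = EXP[i - N]


def gf_mul(a, b):
    """Multiply two field elements using the log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def poly_eval(coeffs, point):
    """Evaluate a polynomial (coeffs[i] is the x^i term) by Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf_mul(acc, point) ^ c
    return acc


def count_roots(coeffs):
    """Count roots among all nonzero field elements (brute force)."""
    return sum(1 for i in range(N) if poly_eval(coeffs, EXP[i]) == 0)


def single_error_locator(bit_pos):
    """Locator polynomial 1 + alpha^bit_pos * x for one flipped bit."""
    return [1, EXP[bit_pos % N]]


sigma = single_error_locator(1234)
degree = len(sigma) - 1
roots = count_roots(sigma)
print(degree, roots)                 # degree and root count agree: 1 1

# A polynomial whose degree exceeds its root count is what a decoder
# rejects: x^2 + x + 1 has no roots in GF(2^13) because 13 is odd.
print(count_roots([1, 1, 1]))        # 0
```

In this model, a locator of degree 4 with no roots in the field would be
rejected in the same way as the second polynomial above, which is
consistent with the -EBADMSG seen here.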
As a test, I removed the return 0 in the if clause in the middle of
bch_decode(), to see what would happen if decoding continues instead of
returning early when all is ok. In this case the aforementioned routines
both seem to return 0, as expected.
I've dumped the read and calculated ECC, and it looks like they are being
generated as expected; indeed, if there were a fault there, reading ok
pages would also fail.
I'm a bit bewildered, as the algorithm apparently has been tested on
MIPS (albeit under QEMU). Of course it's very likely that I've made a
mistake somewhere; in that case it must be in the set-up, as the two files
which actually implement the algorithm are new and not patches to existing
files. I was thinking it was perhaps an endianness problem (our MIPS is
little-endian), but I see it's been tested on x86 too, so it shouldn't be
that.
Any ideas?
/Ricard
--
Ricard Wolf Wanderlöf ricardw(at)axis.com
Axis Communications AB, Lund, Sweden www.axis.com
Phone +46 46 272 2016 Fax +46 46 13 61 30