[PATCH/RFC v4 1/3] Shared BCH ECC library

Ricard Wanderlof ricard.wanderlof at axis.com
Tue Mar 29 09:55:19 EDT 2011


On Fri, 11 Mar 2011, Ivan Djelic wrote:

> The first patch of this series contains a new generic BCH ECC library.
>
> This library can be used to provide software BCH correction on NAND 
> devices (see 2nd and 3rd patch), as well as error correction for hybrid 
> hardware BCH engines.
> ...

I've been trying to apply this patch series to evaluate how fast BCH is 
compared to the existing 1-bit ECC algorithm, to get some idea of the 
performance hit incurred in practice once higher-order ECC correction 
becomes necessary.

I've therefore applied the patch series on a MIPS system running Linux 
2.6.35 (we don't have the latest kernel ported to our platform yet). Yes, 
I realize that's not the latest kernel version; on the other hand, I can't 
see anything in the patch that seems to depend on anything recent in mtd. 
You basically have a BCH library, and a small nand driver implementing ECC 
functionality using the library, plus corresponding modifications to some 
mtd nand files to incorporate the library. The patch didn't apply cleanly, 
but it was fairly obvious where all the bits should go.

The problem I have is I can't get the correction to work properly. I've 
set up our nand flash driver to use the default setting of eccsize = 512 
bytes and eccbytes = 7 to get 4-bit error correction capability per 512 
bytes.

When writing to the flash, I get 28 ECC bytes per 2048-byte page, at the 
end of each OOB area, as expected.
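
The numbers above (eccsize = 512, eccbytes = 7, 4-bit correction, 28 bytes 
per 2048-byte page) can be cross-checked with a little arithmetic. The 
sketch below assumes the field order is derived as m = fls(1 + 8 * eccsize) 
and the strength as t = (8 * eccbytes) / m; that is an assumption about how 
the NAND glue in this series derives them, and all helper names are made up 
for illustration.

```c
/* Hypothetical helpers for checking BCH ECC geometry; not part of the patch. */

/* Position of the highest set bit, 1-based (returns 0 for x == 0). */
static int fls_u32(unsigned int x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Assumed Galois field order m for an ECC step of eccsize data bytes. */
static int bch_field_order(unsigned int eccsize)
{
    return fls_u32(1 + 8 * eccsize);
}

/* Correctable bits per step that fit in eccbytes (assumed: m bits per error). */
static unsigned int bch_strength(unsigned int eccsize, unsigned int eccbytes)
{
    return (8 * eccbytes) / (unsigned int)bch_field_order(eccsize);
}

/* ECC bytes needed per step for t-bit correction: ceil(m * t / 8). */
static unsigned int bch_ecc_bytes(unsigned int eccsize, unsigned int t)
{
    return ((unsigned int)bch_field_order(eccsize) * t + 7) / 8;
}

/* Total ECC bytes stored in the OOB for one page. */
static unsigned int ecc_bytes_per_page(unsigned int pagesize,
                                       unsigned int eccsize,
                                       unsigned int eccbytes)
{
    return (pagesize / eccsize) * eccbytes;
}
```

With eccsize = 512 this gives m = 13, so 4-bit correction needs 52 bits, 
which rounds up to 7 bytes per step; four steps per 2048-byte page then 
account for the 28 bytes seen in the OOB.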

When reading, if there are no bit errors in the read data, everything 
works fine.

However, if I introduce a single bit error in the page, bch_decode() fails 
with -EBADMSG. Some further debugging reveals that 
bch.c:compute_error_locator_polynomial() returns 4 in this particular 
case, whereas bch.c:find_poly_roots() returns 0; the two don't match, and 
the function exits with an error. I'm no wizard with the algorithms used, 
so I have no idea what is reasonable. I would assume both would return 1, 
as there is one bit error that I've introduced.
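
For anyone wanting to reproduce this kind of test on a buffer in memory, a 
minimal sketch of injecting one bit error and confirming that exactly one 
bit differs might look like the following. The helper names and the bit 
numbering (bit 0 is the least significant bit of byte 0) are assumptions 
made for this example and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Flip a single bit in buf; bit 0 is the LSB of buf[0] (sketch convention). */
static void flip_bit(uint8_t *buf, size_t bitnr)
{
    buf[bitnr / 8] ^= (uint8_t)(1u << (bitnr % 8));
}

/* Count the bit positions in which two equal-length buffers differ. */
static unsigned int count_bit_diffs(const uint8_t *a, const uint8_t *b,
                                    size_t len)
{
    unsigned int n = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        uint8_t x = a[i] ^ b[i];

        /* Clear the lowest set bit until none remain. */
        while (x) {
            x &= (uint8_t)(x - 1);
            n++;
        }
    }
    return n;
}
```

Comparing the corrupted buffer against a pristine copy with 
count_bit_diffs() before handing it to the decoder rules out the 
possibility that more than one bit was changed by accident.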

I made a test, removing the return 0 in the if clause in the middle of 
bch_decode() to see what would happen if I don't just return early when 
all is ok. In this case the aforementioned routines seem to return 0 as 
expected.

I've dumped the read and calculated ECC and it looks like they are being 
generated as expected; indeed, if there were a fault there, reading OK 
pages would also fail.

I'm a bit bewildered, as the algorithm apparently has been tested on 
MIPS (albeit under QEMU). Of course it's very likely that I've made a 
mistake somewhere; in that case it must be in the set-up, as the two files 
which actually implement the algorithm are new and not patches to existing 
files. I was thinking it was perhaps an endianness problem (our MIPS is 
little-endian), but I see it's been tested on x86 too, so it shouldn't be 
that.

Any ideas?

/Ricard
-- 
Ricard Wolf Wanderlöf                           ricardw(at)axis.com
Axis Communications AB, Lund, Sweden            www.axis.com
Phone +46 46 272 2016                           Fax +46 46 13 61 30


