state of support for "external ECC hardware"

Thu Nov 8 14:33:08 EST 2012

On Thu, Nov 08, 2012 at 07:22:50PM +0000, Christopher Harvey wrote:
> On Thu, Nov 08, 2012 at 07:59:42PM +0100, Ivan Djelic wrote:
> > On Thu, Nov 08, 2012 at 03:21:25PM +0000, Christopher Harvey wrote:
> > (...) 
> > > We had BCH8 code running, but it wasn't enough. The main reason we
> > > switched away from host side ECC was because we were getting bitflips
> > > within the ECC codeword data itself.
> > 
> > But the ECC bytes are part of the BCH codeword, so I don't understand
> > what the issue could be. Are you sure the bitflips were not in some
> > unprotected OOB area?
> 
> Ok, the ECC bytes I had were stored in the OOB area and were
> unprotected. Any bit flip in the OOB area was a disaster. This was
> coming from a heavily modified forked kernel that had BCH8 bugs in the
> past. For example, I had to fix this one before the patch came out:
> http://arago-project.org/git/projects/linux-omap3.git?p=projects/linux-omap3.git;a=commitdiff;h=adc46d691d745604da1197d154fe712e10ec468d;hp=9e78267ed6302537474489e88bd59827315db15b
> I can't explain why this implementation fails on ECC byte corruption.

Oooh, I think I understand now... I had very similar issues with some BCH8 code on an OMAP3630 board.
The error-correction implementation was buggy and would trip on errors located in the ECC bytes.
Actually, this (and performance issues) is what pushed me into writing lib/bch.c :)
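
To illustrate why a flip in the ECC bytes should be no different from a flip
in the data, here is a minimal sketch. It is not BCH8 and has nothing to do
with the code in lib/bch.c: it swaps in a toy Hamming(7,4) code because that
fits in a few lines. All names (syndrome, hamming74_encode, hamming74_decode,
check_all_single_flips) are made up for this example. The point it shows is
that the parity bits are ordinary codeword positions, so a correct decoder
locates an error there exactly as it would in the data bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy Hamming(7,4), used here only as a stand-in for BCH.
 * Codeword position i (1..7) is stored in bit i of a uint8_t; bit 0 is unused.
 * Positions 1, 2 and 4 hold parity; positions 3, 5, 6 and 7 hold data.
 */
static const unsigned data_pos[4] = { 3, 5, 6, 7 };

/* XOR of the positions of all set bits; 0 for a valid codeword. */
static unsigned syndrome(uint8_t cw)
{
    unsigned s = 0;

    for (unsigned pos = 1; pos <= 7; pos++)
        if (cw & (1u << pos))
            s ^= pos;
    return s;
}

/* Encode the low 4 bits of 'data' into a 7-bit codeword. */
static uint8_t hamming74_encode(uint8_t data)
{
    uint8_t cw = 0;

    for (unsigned i = 0; i < 4; i++)
        if (data & (1u << i))
            cw |= (uint8_t)(1u << data_pos[i]);

    /* Setting the parity bit at position 2^k clears bit k of the syndrome. */
    unsigned s = syndrome(cw);
    for (unsigned k = 0; k < 3; k++)
        if (s & (1u << k))
            cw |= (uint8_t)(1u << (1u << k));
    return cw;
}

/*
 * Correct at most one flipped bit and extract the data.
 * Returns 0 if the codeword was clean, otherwise the position (1..7)
 * of the corrected bit, which may be a parity position.
 */
static int hamming74_decode(uint8_t cw, uint8_t *data_out)
{
    unsigned s = syndrome(cw);
    uint8_t d = 0;

    if (s)
        cw ^= (uint8_t)(1u << s);

    for (unsigned i = 0; i < 4; i++)
        if (cw & (1u << data_pos[i]))
            d |= (uint8_t)(1u << i);

    *data_out = d;
    return (int)s;
}

/* Flip every position of every codeword, parity included, and check recovery. */
static void check_all_single_flips(void)
{
    for (unsigned d = 0; d < 16; d++) {
        uint8_t cw = hamming74_encode((uint8_t)d);

        for (unsigned pos = 1; pos <= 7; pos++) {
            uint8_t out;
            int where = hamming74_decode((uint8_t)(cw ^ (1u << pos)), &out);

            assert(where == (int)pos);
            assert(out == d);
        }
    }
}
```

Calling check_all_single_flips() exercises flips at positions 1, 2 and 4 (the
parity bits) as well as the data positions. A decoder that only handles the
data positions would fail that check, which is the kind of bug described
above.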
BR,
-- 
Ivan