[YAFFS] bad block management policy

Thu Aug 9 10:24:43 EDT 2012

On 08/09/2012 06:45 AM, peterlingoal wrote:
> Hi,
>
> I am using the YAFFS2 filesystem, and some NANDs have hundreds or even
> thousands of blocks (out of 4K) identified as bad. After checking, I
> found that YAFFS2 marks a block bad if three fixable ECC errors happen
> within that block. My questions are:
>
> 1. I am using two Micron NAND chips; one requires a minimum of 1-bit
> ECC while the other requires 4-bit. Bit flipping (although all fixable)
> seems to happen quite often in both types. Is this expected behavior?
> 2. The Micron error management doc says to mark a block bad only when
> a program or erase operation fails, and does not mention reads. So is
> it safe to remove this ECC error counter? Will it lead to unfixable
> errors?
The "strike count" is used to predict when a block has been programmed
enough times that it is close to failure (the point where programmed
data, when read back, contains uncorrectable bit errors).
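
To make the idea concrete, here is a minimal sketch of a strike counter.
It is not the actual YAFFS code: the struct, the function name, and the
limit of three (taken from the "three fixable ECC errors" in the
question) are all assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-block bookkeeping; not the real YAFFS structures. */
struct blk_strikes {
    unsigned strikes;   /* correctable ECC events seen so far */
    bool retire;        /* set once the block should be marked bad */
};

/* Assumed threshold, based on the behaviour described in the question. */
#define SKETCH_STRIKE_LIMIT 3u

/*
 * Record one correctable ECC event on a block.
 * Returns true once the block has reached the limit and should be retired.
 */
static bool blk_add_strike(struct blk_strikes *b)
{
    assert(b != NULL);
    if (!b->retire && ++b->strikes >= SKETCH_STRIKE_LIMIT)
        b->retire = true;
    return b->retire;
}
```

With a limit this low, a block is retired after its third corrected
read, no matter how many program/erase cycles it has actually seen.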

This worked fine for the larger-geometry SLC devices, which didn't show
correctable ECC errors until a block was very near its end of life.
However, newer small-geometry SLC/MLC devices require stronger ECC to
keep the same UBER (uncorrectable bit error rate) as previous-generation
devices. Unfortunately, this means that more correctable errors will be
seen long before the block is near its end of life.

You could modify YAFFS to ignore -EUCLEAN returns from MTD (bit errors
that were corrected by ECC), which would prevent YAFFS from marking
blocks bad prematurely. But then there is no way to predict when a block
is about to wear out and start returning uncorrectable errors (-EBADMSG).
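
A rough sketch of that trade-off, assuming a hypothetical helper that
maps an MTD read return code to an action. The enum, the function name,
and the ignore_euclean switch are made up for illustration and are not
part of YAFFS or MTD; only the meaning of -EUCLEAN (corrected) and
-EBADMSG (uncorrectable) comes from MTD itself.

```c
#include <errno.h>
#include <stdbool.h>

/* EUCLEAN is Linux-specific; fallback value assumed for other hosts. */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif

/* Hypothetical actions a filesystem might take after a read. */
enum read_action {
    ACT_NONE,        /* data is good, nothing to record */
    ACT_STRIKE,      /* data is good, but count a strike against the block */
    ACT_DATA_ERROR,  /* ECC could not correct the data */
    ACT_IO_ERROR     /* some other failure */
};

/*
 * Classify an MTD-style read return value.
 * ret >= 0      : clean read
 * ret == -EUCLEAN: bit errors were found and corrected
 * ret == -EBADMSG: bit errors could not be corrected
 */
static enum read_action classify_mtd_read(int ret, bool ignore_euclean)
{
    if (ret >= 0)
        return ACT_NONE;
    if (ret == -EUCLEAN)
        return ignore_euclean ? ACT_NONE : ACT_STRIKE;
    if (ret == -EBADMSG)
        return ACT_DATA_ERROR;
    return ACT_IO_ERROR;
}
```

With ignore_euclean set, corrected reads no longer count against the
block, so the first sign of wear the caller sees is ACT_DATA_ERROR,
at which point the data is already lost.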

-- 
Peter Barada
peter.barada at gmail.com