[PATCH v3 0/6] NAND BBM + BBT updates

Wed Jan 18 17:04:50 EST 2012

On Tue, Jan 17, 2012 at 2:22 AM, Angus CLARK <angus.clark at st.com> wrote:
> (Indeed, this issue was raised recently in a meeting with one of the major NAND
> manufacturers, and the design engineer was horrified at the thought of relying
> on the OOB for tracking worn blocks.)

That's interesting. I never had this impression, but perhaps the topic
just never came up.

> The use of OOB BB markers certainly has some benefits (as already mentioned in
> previous posts), and I like the idea of being able to use OOB markers in
> conjunction with BBTs.  However, IMHO, I believe the BBT should be regarded as
> the primary source of information, especially when considering inconsistencies
> between the OOB markers and the BBTs.

The facts do seem to be leaning toward the flash-based BBT being the
preferable source of info. But due to some practical concerns
(reliability of the BBT, its resistance to corruption, and non-Linux
interaction with flash), I don't think we can say with 100% certainty
that the BBT is the primary source of bad block info. Even if we can
mitigate the reliability/corruption issues, that still leaves non-Linux
(e.g., bootloader) interaction with flash.

Anyway, the important question is: how does this impact the current
solution I am developing? IMO, this seems primarily a matter of
perspective, which would drive future development but does not
fundamentally alter my proposed patch(es). The choice of "primary
source" may affect the order in which we update the OOB markers and the
BBT, as well as the handling of power cuts, but otherwise we want the
same result regardless of which one is "primary."

Another note regarding the primary source: if the BBT is sufficiently
corrupted (according to ECC), we fall back to the OOB markers. That
doesn't make the flash-based BBT the 100% primary source, but I think
it makes sense. This feature was pulled into the 3.2 release, BTW.
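
To illustrate the fallback described above, here is a minimal
standalone sketch. It is not the kernel code: the struct, the
function name, the 8-block device, and the 0xFF "good" marker value
are all assumptions made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_BLOCKS 8      /* hypothetical tiny device */
#define SKETCH_OOB_GOOD   0xFF   /* assumed "good block" marker value */

/* Hypothetical model of a flash-based BBT after it has been read back. */
struct sketch_bbt {
	bool ecc_uncorrectable;          /* table read could not be corrected by ECC */
	bool bad[SKETCH_NUM_BLOCKS];     /* per-block bad flag stored in the table */
};

/*
 * Fill is_bad[] for every block.
 * Returns true if the BBT was used, false if we fell back to the
 * OOB markers because the BBT was too corrupted to trust.
 */
static bool sketch_load_bad_blocks(const struct sketch_bbt *bbt,
				   const uint8_t oob_marker[SKETCH_NUM_BLOCKS],
				   bool is_bad[SKETCH_NUM_BLOCKS])
{
	size_t i;

	if (!bbt->ecc_uncorrectable) {
		/* Normal case: the table is readable, so use it. */
		for (i = 0; i < SKETCH_NUM_BLOCKS; i++)
			is_bad[i] = bbt->bad[i];
		return true;
	}

	/* Fallback: rebuild the bad block info from the OOB markers. */
	for (i = 0; i < SKETCH_NUM_BLOCKS; i++)
		is_bad[i] = (oob_marker[i] != SKETCH_OOB_GOOD);
	return false;
}
```

In this model the OOB markers are only consulted when the table
itself cannot be read, which is why the BBT is "mostly" but not 100%
the primary source.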

> On 01/11/2012 10:28 PM, Artem Bityutskiy wrote:
>> 1. When we get erase error. Well, if SW erases a block, it does not care
>> of the contents. This means that if after the reboot SW will re-try
>> erasing it. And if the block is bad, and previously the erasure failed,
>> it will fail again, and SW will mark it as bad again.
...
> In other words, we cannot rely on erase failures as a way of recovering bad
> block status, although I accept in some circumstances, it is probably the best
> we can do!

I think that if there are really power-cut issues while marking a bad
block, we will often have to resort to the imperfect "best we can do".
If there are no more fundamental objections, I will resend soon with a
version that writes the marker to OOB first, then updates the BBT.
There will be an option to disable writing markers to OOB entirely.
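
Roughly, the ordering I have in mind looks like the sketch below.
This is a simplified standalone model, not the actual patch: the
mb_* names, the in-memory "flash" struct, and the choice to still
update the BBT when the OOB write fails (reporting the first error)
are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define MB_NUM_BLOCKS 8      /* hypothetical tiny device */
#define MB_OOB_GOOD   0xFF   /* assumed "good block" marker value */
#define MB_OOB_BAD    0x00   /* assumed "bad block" marker value */

/* Hypothetical in-memory stand-in for the flash state. */
struct mb_flash {
	uint8_t oob_marker[MB_NUM_BLOCKS];
	bool bbt_bad[MB_NUM_BLOCKS];
	bool oob_write_fails;    /* simulate a failed OOB program */
	bool bbt_write_fails;    /* simulate a failed BBT update */
};

static int mb_write_oob_marker(struct mb_flash *f, unsigned int block)
{
	if (f->oob_write_fails)
		return -1;
	f->oob_marker[block] = MB_OOB_BAD;
	return 0;
}

static int mb_update_bbt(struct mb_flash *f, unsigned int block)
{
	if (f->bbt_write_fails)
		return -1;
	f->bbt_bad[block] = true;
	return 0;
}

/*
 * Mark a block bad: OOB marker first (unless disabled), then the BBT.
 * Both steps are attempted; the first error seen is returned.
 */
static int mb_mark_bad(struct mb_flash *f, unsigned int block, bool write_oob)
{
	int ret = 0, res;

	if (block >= MB_NUM_BLOCKS)
		return -1;

	if (write_oob)
		ret = mb_write_oob_marker(f, block);

	res = mb_update_bbt(f, block);
	if (!ret)
		ret = res;

	return ret;
}
```

Passing write_oob = false corresponds to the option of not writing
markers to OOB at all, in which case only the BBT records the block
as bad.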

Thanks,
Brian