[BUG] Nand support broken with v2.6.36-rc1
Brian Norris
norris at broadcom.com
Tue Aug 17 13:00:39 EDT 2010
Hello,
On 08/17/2010 01:52 AM, Michael Guntsche wrote:
> The only thing that might be special with the nand driver that is being
> used is that a different oob layout is being used.
>
> static struct nand_ecclayout rbppc_nand_oob_16 = {
> .eccbytes = 6,
> .eccpos = { 8, 9, 10, 13, 14, 15 },
> .oobavail = 9,
> .oobfree = { { 0, 4 }, { 6, 2 }, { 11, 2 }, { 4, 1 } }
> };
On 08/17/2010 04:36 AM, Michael Guntsche wrote:
> I added this to the nand driver itself.
>
> static uint8_t scan_ff_pattern[] = { 0xff, 0xff };
> static struct nand_bbt_descr rbppc_nand_smallpage = {
> .options = NAND_BBT_SCAN2NDPAGE,
> .offs = NAND_SMALL_BADBLOCK_POS,
> .len = 1,
> .pattern = scan_ff_pattern
> };
>
> and the driver is working again. But shouldn't this be supported by the stock level code as well?
Why yes, it should! Somebody (probably me) goofed. Your nand_ecclayout
is conflicting with the kernel's choice of bad block position. Recent
changes must have affected which position is chosen automatically by the
kernel.
One of the following two cases is likely the problem:
(1) Your chip is supposed to use offset 0, not 5, for the bad block
marker (BBM) (i.e., NAND_LARGE_BADBLOCK_POS, not
NAND_SMALL_BADBLOCK_POS), and so your ecclayout should not list byte 0
in the "oobfree" array (a design flaw that would have been there since
you first began using this chip)
(2) I made the commit that you mentioned
(c7b28e25cb9beb943aead770ff14551b55fa8c79) too restrictive in allowing
chips to use NAND_SMALL_BADBLOCK_POS.
Option 2 is likely the case, and in fact, I realized a stupid mistake I
made in refactoring the detection here.
I have been studying data from hundreds of flash chips to find where the
factory-determined markers should be stored. Unfortunately, I can't
cover all of them, and so your Hynix chip is likely one that was
overlooked. Could you send the full NAND ID string (8 bytes, not just
the manufacturer and chip ID), an exact part number for the flash, and a
datasheet? Any one of those could help (the datasheet being the most
important), but whatever you can provide is helpful. More data on your
chip would allow me to determine the problem for sure; I will send a
patch ASAP once I get your information.
Sorry for the trouble!
On another note, it may be wise to have the kernel check for such a
conflict between bad-block markers and the ECC layout. If a position
needed by the bad-block marker is listed in "oobfree" or "eccpos", then
we have a problem. Does this sound like a good idea to anybody? If so,
what would be the best approach:
* print an error and quit detection
* try to modify the ecclayout, bbm info or both
* try to modify, and fall back to an error message and quit if necessary
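To make the proposed check concrete, here is a rough, untested-in-kernel
sketch of what the detection part might look like. It is plain userspace
C, not actual MTD code: struct ecclayout_sketch is a simplified stand-in
for nand_ecclayout, the array sizes are assumptions, and bbm_conflicts()
is a hypothetical helper name. The sample layout is the one Michael
quoted above.

```c
/* Simplified stand-ins for the kernel structures; the sizes are assumptions. */
#define SKETCH_ECCPOS_MAX	64
#define SKETCH_OOBFREE_MAX	8

struct oobfree_sketch {
	unsigned int offset;
	unsigned int length;
};

struct ecclayout_sketch {
	unsigned int eccbytes;
	unsigned int eccpos[SKETCH_ECCPOS_MAX];
	unsigned int oobavail;
	struct oobfree_sketch oobfree[SKETCH_OOBFREE_MAX];
};

/*
 * Hypothetical helper: return 1 if any of the bbm_len OOB bytes starting
 * at bbm_offs is used for ECC or advertised as free, 0 otherwise.
 * A zero-length oobfree entry is treated as the end of the list.
 */
static int bbm_conflicts(const struct ecclayout_sketch *layout,
			 unsigned int bbm_offs, unsigned int bbm_len)
{
	unsigned int pos, i;

	for (pos = bbm_offs; pos < bbm_offs + bbm_len; pos++) {
		/* Is this byte one of the ECC positions? */
		for (i = 0; i < layout->eccbytes && i < SKETCH_ECCPOS_MAX; i++)
			if (layout->eccpos[i] == pos)
				return 1;

		/* Does this byte fall inside any free region? */
		for (i = 0; i < SKETCH_OOBFREE_MAX && layout->oobfree[i].length; i++) {
			unsigned int start = layout->oobfree[i].offset;
			unsigned int end = start + layout->oobfree[i].length;

			if (pos >= start && pos < end)
				return 1;
		}
	}
	return 0;
}

/* The 16-byte OOB layout quoted from Michael's driver. */
static const struct ecclayout_sketch rbppc_layout_sketch = {
	.eccbytes = 6,
	.eccpos = { 8, 9, 10, 13, 14, 15 },
	.oobavail = 9,
	.oobfree = { { 0, 4 }, { 6, 2 }, { 11, 2 }, { 4, 1 } },
};
```

With this layout, bbm_conflicts(&rbppc_layout_sketch, 0, 1) reports a
conflict, because byte 0 sits inside the { 0, 4 } free region, while
bbm_conflicts(&rbppc_layout_sketch, 5, 1) does not, because byte 5 is
neither an ECC byte nor free. What to do once a conflict is found (error
out, or try to adjust the layout) is the open question above.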
Thanks,
Brian