ARM - IMX - NAND - Kernel 4.x - ONFI PARAMETER PAGE FAILS TO READ !

Boris Brezillon boris.brezillon at free-electrons.com
Thu Jul 20 00:39:34 PDT 2017


On Thu, 20 Jul 2017 06:13:53 +0000
"Vellemans, Noel" <Noel.Vellemans at visionBMS.com> wrote:

> Hi All
>  
> I've been running 2.6.35.x for some time now on our IMX53 custom boards (booting from NAND).
>  
> Recently I started upgrading the kernel to a more recent version (4.4.75 and/or a 4.12 series kernel).
> All fine so far.
>  
> I'm using 10 boards (100% identical) to test-drive the new kernel.
> 9 out of the 10 boards run fine with the new kernel; 1 board fails to recognize the NAND flash (8 bits, 2 chips, hardware ECC enabled, Micron MT29F16G08ABACAWP) with the new kernel (with the old kernel all seems to be fine).
> 
> The reason for this failure is that when reading the ONFI parameter page (command NAND_CMD_PARAM), the bytes read from the NAND chip seem to be offset by one byte.
> For 9 of the 10 boards, the data read back starts (as specified) with ONFI, and those boards have been working fine for multiple days.
>  
> On the failing CPU/board it starts with NFI (the O is missing): all 256 bytes are shifted by one byte, or in other words, the first byte is missing. If the first byte were there, all would be OK, so the data is not rubbish.
>  
> Reading Manufacturer ID: 0x2c, Chip ID: 0x48 works; reading the ONFI parameter page fails (with the 4.4.x kernel).
>  
> I have swapped the flash chips, and the error stays with the CPU/board.
>  
> (Note: putting the old 2.6.35.x kernel back makes everything work again. So it must be related to the new kernel drivers, but it could be a silicon bug triggered by some exception if you ask me. I've been digging into this for more than a week.)
>  
> I've been cross-checking the errata but cannot find anything that would fit.
> I've also triple-checked each NFC register: all registers are set up correctly (comparing the good and bad boards gives the same register settings).
>  
>  
> Any clue? Any hints to get me going? (As said before, I've been searching for a week on this, with no luck so far in understanding or solving the issue.)
> 
> Details (please see below):
>  
> Just for info, the type of NAND used (2 chips, 8-bit mode):
> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0x48
> nand: Micron MT29F16G08ABACAWP
> nand: 2048 MiB, SLC, erase size: 512 KiB, page size: 4096, OOB size: 224
>  
> 
> At the error the kernel bails out with:
> 
> [    1.646968] nand: Could not find valid ONFI parameter page; aborting
> [    1.653593] nand: No NAND device found
> 
> 
> When comparing a good (working) board with a bad (not working) one, I get the following. (I've added some printk's to show the error details.)
> 
> * For a good one I get this dump of the ONFI parameters read back:
> 
> [    2.909684] NAND_CMD_PARAM- data[0] = 0x4F => O 
> [    2.914510] NAND_CMD_PARAM- data[1] = 0x4E => N 
> [    2.919232] NAND_CMD_PARAM- data[2] = 0x46 => F 
> [    2.923986] NAND_CMD_PARAM- data[3] = 0x49 => I
> [    2.928706] NAND_CMD_PARAM- data[4] = 0x1E
> [    2.933456] NAND_CMD_PARAM- data[5] = 0x00
> [    2.938175] NAND_CMD_PARAM- data[6] = 0x58
> ..
> ... some bytes/lines are stripped here
> ..
> [    4.149180] NAND_CMD_PARAM- data[254] = 0x20 (crc is/or should be here on this offset)
> [    4.154101] NAND_CMD_PARAM- data[255] = 0x12 (crc is/or should be here on this offset)
> 
>  
>  
> 
> * For the bad one (failing on kernel 4.12 / 4.4.x, but working on 2.6.35) I get this dump of the ONFI parameters read back:
> 
> 
> [    1.819926] NAND_CMD_PARAM- data[0] = 0x4E => N
> [    1.824666] NAND_CMD_PARAM- data[1] = 0x46 => F
> [    1.829405] NAND_CMD_PARAM- data[2] = 0x49 => I
> [    1.834143] NAND_CMD_PARAM- data[3] = 0x1E
> [    1.838882] NAND_CMD_PARAM- data[4] = 0x00
> [    1.843619] NAND_CMD_PARAM- data[5] = 0x58
> ..
> ... some bytes/lines are stripped here
> ..
> [    3.053545] NAND_CMD_PARAM- data[253] = 0x20 ????? (crc byte also on the wrong offset!!!)
> [    3.058458] NAND_CMD_PARAM- data[254] = 0x12 (crc is/or should be here on this offset)
> [    3.063371] NAND_CMD_PARAM- data[255] = 0x4F ( O of the second ONFI parameter block)

Hm, it seems you're missing the first byte. It might be that your
controller is configured in some kind of EDO mode, and I'm pretty sure
the param page should be read in mode 0 (this implies EDO mode
disabled).

Maybe you can try playing with NFC_V3_DELAY_LINE, but honestly, I don't
know the MXC NAND controller well enough to tell what could explain this
behavior.
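
To tell this "first byte dropped" pattern apart from random corruption
while debugging, something along the lines of the sketch below could be
dropped next to the printk's. It is only a minimal userspace-style
illustration, not the actual nand_base.c code: the helper names
onfi_page_valid() and onfi_page_shifted_by_one() are made up for this
example. The CRC parameters (polynomial 0x8005, initial value 0x4F4E,
computed over bytes 0..253 and stored little-endian in bytes 254..255)
are the ones the ONFI specification defines for the parameter page.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONFI_CRC_BASE 0x4F4E	/* CRC-16 initial value per ONFI */

/* Bitwise CRC-16, polynomial 0x8005, MSB first, as used by ONFI. */
static uint16_t onfi_crc16(uint16_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= (uint16_t)(*p++ << 8);
		for (int i = 0; i < 8; i++)
			crc = (uint16_t)((crc << 1) ^
					 ((crc & 0x8000) ? 0x8005 : 0));
	}
	return crc;
}

/*
 * Hypothetical helper: returns 1 if buf holds a well-formed 256-byte
 * parameter page copy (signature "ONFI" and matching CRC).
 */
static int onfi_page_valid(const uint8_t *buf)
{
	uint16_t stored = (uint16_t)(buf[254] | (buf[255] << 8));

	return memcmp(buf, "ONFI", 4) == 0 &&
	       onfi_crc16(ONFI_CRC_BASE, buf, 254) == stored;
}

/*
 * Hypothetical helper: returns 1 if the 256-byte buffer looks like a
 * parameter page whose first byte was dropped, i.e. it starts with
 * "NFI" and its last byte is the 'O' of the next redundant copy.
 */
static int onfi_page_shifted_by_one(const uint8_t *buf)
{
	return memcmp(buf, "NFI", 3) == 0 && buf[255] == 'O';
}
```

On the failing board, onfi_page_valid() would be expected to fail
while onfi_page_shifted_by_one() reports the shift, which matches the
dump above (data[0..2] = "NFI", data[255] = 'O').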

> 
>  
> 
> 
> The strange thing is that 9 out of 10 boards are OK, but 1 out of 10 is bad on these recent kernels.
> 
> When running the older 2.6.35 kernel, even on this "bad" board, all is working fine.
> 
> Best Regards
> Noel
> 



