Bit flip detection after boot

Raphael Pereira raphael at rmi.inf.br
Fri Apr 15 18:55:30 PDT 2016


Hi,

I have been able to build two different kernel binaries that produce
different results.

On 3.10.98 I get NO bit-flips whatsoever AND I can write the bootable
image to block 0 and it works. In other words, the CPU Boot ROM
understands the written data AND ECC in block 0 and boots successfully
from flash. Apart from booting, the system runs perfectly with no
error messages.

On 4.0.4 and 4.1.18, both kernels compiled with the same
configuration, bit-flips occur AND the Boot ROM doesn't recognize the
written image, so I cannot boot from flash. The system runs fine (I
didn't try long runs) but it shows a huge number of bit-flips every
time it is booted.
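
To compare kernels on more than an impression, the bit-flip reports
per boot can be counted from a saved kernel log. This is only a
minimal Python sketch: it assumes the log was saved as text and that
the messages have the "ubiN: fixable bit-flip detected at PEB N" form
quoted further down. The function names are hypothetical.

```python
import re
from collections import Counter

# Message format taken from the UBI log line quoted in this thread.
BITFLIP_RE = re.compile(r"(ubi\d+): fixable bit-flip detected at PEB (\d+)")


def count_bitflips(log_text):
    """Return a Counter mapping (ubi device, PEB number) to message count."""
    hits = Counter()
    for line in log_text.splitlines():
        match = BITFLIP_RE.search(line)
        if match:
            hits[(match.group(1), int(match.group(2)))] += 1
    return hits


def summarize(log_text):
    """Return the total number of reports and how many distinct PEBs they hit."""
    hits = count_bitflips(log_text)
    return {"messages": sum(hits.values()), "distinct_pebs": len(hits)}
```

Running summarize() on the saved log of each kernel gives two numbers
per boot that can be compared directly between 3.10.98 and 4.x.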

It sounds like there is a problem in the low-level drivers for kernels
above 3.10; I am not sure which version is the first to show this
behaviour. I also notice that the GPMI driver is quite different on
4.x. IMHO, at some point in GPMI development, testing on the mx23
architecture stopped, and there might be some kind of bit swap in ECC
writing. Maybe it is related to particular differences between
mx28/mx6/etc.
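
One way to test the bit-swap idea is to write the same image with
both kernels, dump the raw flash contents each time, and count the
differing bits per page. The Python sketch below is only an
illustration. It assumes two dump files that include OOB and store
each page as 2048 data bytes followed by 64 OOB bytes (the sizes
reported for this chip); the real on-flash layout used by the BCH
engine may differ. The function name is hypothetical.

```python
PAGE_SIZE = 2048  # page size reported for the Micron chip in this thread
OOB_SIZE = 64     # OOB size reported for the same chip


def bit_diff_per_page(dump_a, dump_b, page_size=PAGE_SIZE, oob_size=OOB_SIZE):
    """Return [(page, data_bit_diffs, oob_bit_diffs)] for pages that differ.

    Assumption: each page is stored as page_size data bytes followed by
    oob_size OOB bytes. Trailing partial pages are ignored.
    """
    stride = page_size + oob_size
    pages = min(len(dump_a), len(dump_b)) // stride
    result = []
    for page in range(pages):
        a = dump_a[page * stride:(page + 1) * stride]
        b = dump_b[page * stride:(page + 1) * stride]
        # XOR leaves a 1 in every bit position where the two dumps disagree.
        data = sum(bin(x ^ y).count("1")
                   for x, y in zip(a[:page_size], b[:page_size]))
        oob = sum(bin(x ^ y).count("1")
                  for x, y in zip(a[page_size:], b[page_size:]))
        if data or oob:
            result.append((page, data, oob))
    return result
```

If the differences cluster in the same byte offsets on every page,
that would point to a layout or ECC placement change rather than
random flips.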

It would be great to know when the driver underwent that massive
change and up to which kernel version mx23 is/was supported.

Best Regards,

2016-04-14 5:51 GMT-03:00 Richard Weinberger <richard.weinberger at gmail.com>:
> On Wed, Apr 13, 2016 at 11:25 PM, Raphael Pereira <raphael at rmi.inf.br> wrote:
>> Hi,
>>
>> I have a custom board with a GPMI interface (Freescale i.MX233) with
>> kernel 4.1.18 using Micron MT29F4G08ABADAWP (nand: 512 MiB, SLC, erase
>> size: 128 KiB, page size: 2048, OOB size: 64).
>>
>> When I boot, the flash gets recognized and works flawlessly. I can
>> ubiformat, ubiattach, mount, copy a lot of files, umount, remount,
>> check and test in whatever way I need. No problem arises.
>> The problem is that when I shut the system down (turn it off) and boot
>> again, all used blocks are marked as badblocks.
>
> Sounds like you have a massive bitflip problem.
> This is where you should start investigating.
> Are you facing such massive bitflips that the bad block marker is wrongly set?
>
>> I have used another flash (Toshiba, w/ erase size: 256 KiB) and it
>> works perfectly, with no problem at all.
>>
>> Trying to find out what might be the problem, I found BBT DTS option,
>> so I enabled it and, instead of marking all blocks as bad, I end up
>> with UBI detecting bit-flips on all sectors, but correctable flips, so
>
> You've disabled the bad block scan and tried to paper over the root cause. ;)
>
>> I can use the flash normally (although it takes a lot of time to boot
>> because of the many messages "ubi0: fixable bit-flip detected at PEB
>> XXX").
>>
>> I have changed the chip a lot of times and have checked my hardware
>> and I am quite sure it is not a hardware problem. I think that some
>> structure detection of the flash is wrong in the kernel for this
>> particular case (i.MX233 BCH + MT29F4G08ABADAWP).
>>
>> Can someone guide me in debugging this problem?
>
> As I said, find out why you're facing so many bitflips.
>
> --
> Thanks,
> //richard



-- 
Raphael Derosso Pereira
raphael at rmi.inf.br
+55-41-8877-1762


