Re: UBI leb_write_unlock NULL pointer Oops (continuation)

Bill Pringlemeir bpringlemeir at nbsps.com
Tue Feb 11 10:25:44 EST 2014


>> On 04.02.2014 18:01, Wiedemer, Thorsten (Lawo AG) wrote:

>>> I made a "hardcore test" with:
>>> $ while [ 1 ]; do cp <file_of_8kByte_size> tmp/<file_of_8kByte_size.1>; sync; done &
>>> $ while [ 1 ]; do cp <file_of_8kByte_size> tmp/<file_of_8kByte_size.2>; sync; done &
>>> $ while [ 1 ]; do cp <file_of_8kByte_size> tmp/<file_of_8kByte_size.3>; sync; done &

>>> It took about 2-3 hours until I had an error (two times):

>> -----Original Message-----
>> From: Richard Weinberger [mailto:richard at nod.at]

>> This test ran overnight without any error on my imx51 board. :-\

>> Thorsten, can you please enable CONFIG_DEBUG_LIST?
>> Also try whether you can trigger the issue with lock debugging
>> enabled.

On 11 Feb 2014, Thorsten.Wiedemer at lawo.com wrote:

> short update (I was out of office the rest of last week).  I compiled
> the kernel with the debug flags for debug list and lock alloc.  The
> kernel compiled with gcc-4.8.2 didn't start (no output on serial
> console and reboot of the system).  I didn't try (yet) to find out
> what happens at startup.

You don't need to enable the 'lock alloc' debugging; just the 'debug
list' as Richard suggested.  One at a time would work and give clues if
you can reproduce it.
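
To confirm which debug options actually ended up in a given build, a
small check against the kernel .config can help.  This is only a sketch:
the function name check_debug_opts and its default option list are
assumptions for illustration, and the path to the .config is whatever
your build tree uses.

```shell
# Sketch: report whether selected debug options are set in a kernel .config.
# The function name and the default option list are illustrative assumptions.
check_debug_opts() {
    config_file=$1
    shift
    # Default to the list-debugging option discussed in this thread.
    [ $# -gt 0 ] || set -- CONFIG_DEBUG_LIST
    for opt in "$@"; do
        # An enabled boolean option appears as "CONFIG_FOO=y" in .config.
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: not enabled"
        fi
    done
}
```

For example, `check_debug_opts .config CONFIG_DEBUG_LIST
CONFIG_DEBUG_LOCK_ALLOC` shows at a glance whether one or both options
are active, which makes it easier to test them one at a time.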

> I compiled the same kernel (and same config) with gcc-4.4.4. The write
> test runs now for over 16 hours without error.  Next step is to find
> out whether this is due to a changed timing because of the debug flags
> or if it's the compiler.

I ran a test as per the above on an IMX25 and mxc_nand has 448179139
interrupts, with about 6 bit flips and one torture test.  It has been
running for about four days.  I am using gcc 4.7.3 (crosstool-ng) and
backports to 2.6.36.  

I think that the issue is not related to an MTD driver and/or UBI/UbiFS
directly.  It is more likely an architecture issue and maybe some API
inconsistency.  It could be compiler related, however, it seems many
people have seen the issue on various ARM926 systems (different Linux
versions, different compilers, and different MTD drivers).

User space tasks running in parallel with the test may play a role.  Did
you turn CONFIG_PREEMPT off?  I think memory pressure and other effects
(not related to UBI/UbiFS) may be needed to trigger the issue.  We don't
normally see this on our systems.  The one time it happened, an
application developer ran some 'ls -R' or 'find .' in parallel with a
file intensive feature in our application.  I haven't found a test to
reproduce it reliably.
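
One way to combine the copy/sync test from earlier in the thread with a
parallel directory walk is sketched below.  It is not a known
reproducer; the function names (copy_loop, walk_loop, run_stress), the
bounded iteration count, and the copy.N file names are assumptions made
for illustration.

```shell
# Sketch: bounded copy/sync test plus a parallel directory walk.
# Function names, arguments, and file names are illustrative assumptions.

# Copy SRC to DST COUNT times, calling sync after each copy.
copy_loop() {
    src=$1 dst=$2 count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        cp "$src" "$dst" || return 1
        sync
        i=$((i + 1))
    done
}

# Walk DIR COUNT times, similar to running 'find .' next to the test.
walk_loop() {
    dir=$1 count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # Errors from find are ignored; only the extra load matters here.
        find "$dir" > /dev/null 2>&1 || true
        i=$((i + 1))
    done
}

# Start three copy loops and one walker in the background, then wait.
run_stress() {
    src=$1 dir=$2 count=$3
    pids=""
    for n in 1 2 3; do
        copy_loop "$src" "$dir/copy.$n" "$count" &
        pids="$pids $!"
    done
    walk_loop "$dir" "$count" &
    pids="$pids $!"
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}
```

Called as `run_stress <file_of_8kByte_size> tmp 100000` on the UBIFS
mount, this runs the three writers and the walker until the count is
reached, and returns non-zero if any copy failed.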

Fwiw,
Bill Pringlemeir.


