UBI leb_write_unlock NULL pointer Oops (continuation)

Wed Feb 5 16:45:26 EST 2014

> On 04.02.2014 18:01, Wiedemer, Thorsten (Lawo AG) wrote:

>> I made a "hardcore test" with:
>> $ while [ 1 ]; do cp <8kByte_file> tmp/<8kByte_file.1>; sync; done &
>> $ while [ 1 ]; do cp <8kByte_file> tmp/<8kByte_file.2>; sync; done &
>> $ while [ 1 ]; do cp <8kByte_file> tmp/<8kByte_file.3>; sync; done &

>> It took about 2-3 hours until I had an error (two times):

On  5 Feb 2014, richard at nod.at wrote:

> This test ran overnight without any error on my imx51 board. :-\

> Bill's great analysis showed that it may be a linked list corruption
> in rw_semaphore.  Thorsten, can you please enable CONFIG_DEBUG_LIST?
> Also try whether you can trigger the issue with lock debugging
> enabled.

I am trying to run the same test, with both 'fastmap' and 'kmemleak'
enabled.  kmemleak has reported various occurrences of these two traces:

unreferenced object 0xc2c06e50 (size 24):
  comm "sync", pid 2941, jiffies 855335 (age 6354.950s)
  hex dump (first 24 bytes):
    5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 79 00 00 00  ZZZZZZZZZZZZy...
    07 00 00 00 5a 5a 5a a5                          ....ZZZ.
  backtrace:
    [<c019a544>] kmem_cache_alloc+0x10c/0x1a0
    [<c02b2b6c>] ubi_update_fastmap+0xdc/0x14f4
    [<c02ac204>] ubi_wl_get_peb+0x28/0xbc
    [<c02a64c0>] ubi_eba_write_leb+0x23c/0x884
    [<c02a51a4>] ubi_leb_write+0xc4/0xe0
    [<c0200f38>] ubifs_leb_write+0x9c/0x130
    [<c020b28c>] ubifs_log_start_commit+0x230/0x3f4
    [<c020c368>] do_commit+0x134/0x870
    [<c01fbfa0>] ubifs_sync_fs+0x88/0x9c
    [<c01c30bc>] __sync_filesystem+0x74/0x98
    [<c01a2860>] iterate_supers+0x9c/0x104
    [<c01c31f4>] sys_sync+0x3c/0x68
    [<c0129300>] ret_fast_syscall+0x0/0x1c
    [<ffffffff>] 0xffffffff
unreferenced object 0xc2c06df0 (size 24):
  comm "flush-ubifs_0_3", pid 260, jiffies 867487 (age 6233.430s)
  hex dump (first 24 bytes):
    5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 7a 00 00 00  ZZZZZZZZZZZZz...
    1e 00 00 00 5a 5a 5a a5                          ....ZZZ.
  backtrace:
    [<c019a544>] kmem_cache_alloc+0x10c/0x1a0
    [<c02b2b6c>] ubi_update_fastmap+0xdc/0x14f4
    [<c02ac204>] ubi_wl_get_peb+0x28/0xbc
    [<c02a64c0>] ubi_eba_write_leb+0x23c/0x884
    [<c02a50c0>] ubi_leb_map+0x70/0x90
    [<c020125c>] ubifs_leb_map+0x74/0x100
    [<c020af44>] ubifs_add_bud_to_log+0x1f4/0x30c
    [<c01f4830>] make_reservation+0x2e0/0x3e0
    [<c01f53e8>] ubifs_jnl_write_data+0xfc/0x25c
    [<c01f838c>] do_writepage+0x88/0x260
    [<c017b368>] __writepage+0x18/0x84
    [<c017b98c>] write_cache_pages+0x1b4/0x3ac
    [<c01bead4>] writeback_single_inode+0x9c/0x258
    [<c01befac>] writeback_sb_inodes+0xbc/0x180
    [<c01bf6c0>] writeback_inodes_wb+0x7c/0x178
    [<c01bfa00>] wb_writeback+0x244/0x2ac
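
To see how many of the reports go through the same function, a saved
copy of the kmemleak output can be summarised.  This is only a sketch:
the helper name 'count_leaks_through' and the file name in the usage
comment are made up for illustration, and it assumes every report starts
with an 'unreferenced object' line, as in the traces above.

```shell
#!/bin/sh
# Count kmemleak reports whose backtrace contains a given symbol.
#   $1: saved kmemleak report (hypothetical file name, e.g. a copy of
#       /sys/kernel/debug/kmemleak taken with debugfs mounted)
#   $2: symbol to look for, e.g. ubi_update_fastmap
count_leaks_through() {
    awk -v sym="$2" '
        # A new report starts: account for the previous one, then reset.
        /^unreferenced object/ { if (hit) n++; hit = 0 }
        # Backtrace lines look like "[<addr>] symbol+0xoff/0xlen".
        index($0, sym "+")     { hit = 1 }
        # Account for the last report in the file.
        END { if (hit) n++; print n + 0 }
    ' "$1"
}

# Usage (not run here):
#   cat /sys/kernel/debug/kmemleak > kmemleak.txt
#   count_leaks_through kmemleak.txt ubi_update_fastmap
```

Counting per report rather than per matching line avoids counting a
report twice if the symbol shows up more than once in one backtrace.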

These are 'cache' allocations (kmem_cache_alloc), so I am suspicious of
the kmemleak reports.  My Linux (and so its kmemleak) is also old, with
the UBI/UBIFS/MTD patches applied.  Thorsten, have you posted a .config
somewhere?  I am also testing on an IMX25 system and trying to replicate
the problem with your test, although my Linux version is different.  I
suspect Richard will have tried with 'fastmap' as well.  Are you running
without 'fastmap', Thorsten?  I will let my system run overnight.  For
your config, maybe just the output of

 $ grep -E 'MTD|UBI' .config | grep -v '^#'

is enough?  Or you could put a full config on pastebin or someplace.
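
A slightly wider filter would also show whether the debug options
mentioned in this thread are switched on.  This is a sketch only: the
function name 'show_ubi_config' is made up, and which extra options are
worth listing (CONFIG_DEBUG_LIST, CONFIG_PROVE_LOCKING,
CONFIG_DEBUG_KMEMLEAK) is my assumption.

```shell
#!/bin/sh
# Print the enabled options relevant to this thread from a kernel config.
#   $1: path to the config file (defaults to ./.config)
show_ubi_config() {
    # 'UBI' also matches the UBIFS options; the second grep drops
    # comments and the "# CONFIG_FOO is not set" lines.
    grep -E 'MTD|UBI|DEBUG_LIST|PROVE_LOCKING|DEBUG_KMEMLEAK' "${1:-.config}" |
        grep -v '^#'
}

# Usage (not run here):
#   show_ubi_config /path/to/linux/.config
```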

I am pretty sure that http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854
is not the cause of this issue, although it is a good thing to be aware
of.  You can apply the patch in the crosstool-ng directory to fix
gcc-4.8.  It is quite possible that the FSL/Linaro people have done
this.  The vanilla gcc 4.8.2 tarball does not seem to include this
patch.

I have also had this occur with gcc 4.7.  In particular, this same sort
of issue has been occurring for some time (since before gcc 4.8's
release on 2013-03-22).  Another memory issue was a suspect, but it has
now been fixed and we still seem to have this issue.

Fwiw,
Bill Pringlemeir.