AW: AW: UBI leb_write_unlock NULL pointer Oops (continuation)
Wiedemer, Thorsten (Lawo AG)
Thorsten.Wiedemer at lawo.com
Wed Feb 12 10:18:05 EST 2014
Hi,
> On 11 Feb 2014, Thorsten.Wiedemer at lawo.com wrote:
>
>> short update (I was out of office the rest of last week). I compiled
>> the kernel with the debug flags for debug list and lock alloc. The
>> kernel compiled with gcc-4.8.2 didn't start (no output on serial
>> console and reboot of the system). I didn't try (yet) to find out
>> what happens at startup.
>
> You don't need to enable the 'lock alloc' debugging; just the 'debug
> list' as Richard suggested. One at a time would work and give clues if
> you can reproduce it.
I tested this with a kernel compiled with gcc-4.4.4. The error occurred once, but there was no bug report from the list debugging, only the kernel oops:
Internal error: Oops: 17 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 Tainted: G O (3.6.11 #26)
PC is at __up_write+0x38/0x168
LR is at __up_write+0x20/0x168
pc : [<c0234df0>] lr : [<c0234dd8>] psr: a0000093
sp : c726bc20 ip : c79c83bc fp : 60000013
r10: c71fed3c r9 : 00000001 r8 : c71fed38
r7 : 00000000 r6 : c726a000 r5 : 00000000 r4 : c71fed20
r3 : 00000001 r2 : c726a000 r1 : 00000000 r0 : 00000002
Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: 0005317f Table: 87104000 DAC: 00000015
Process RAVENNA_streame (pid: 852, stack limit = 0xc726a270)
Stack: (0xc726bc20 to 0xc726c000)
bc20: c7aeb000 c71fed20 c79bc800 c726a000 00000656 00000001 00000001 00000656
bc40: c7aeb000 c02dfc1c c79bc800 00000000 c79c4600 000004ea 00000001 c02e0c14
bc60: 00000800 00000000 c726bc70 c01e9800 32353128 32323132 74613135 31202c61
bc80: 31200029 00010029 00000001 c7ac6920 000002c2 c0238650 c053a278 c071efda
bca0: c071e000 c071eff3 c071efd7 0001c800 00000656 c7aeb000 00000800 00000800
bcc0: 00000001 c72cc000 0001c800 c02ded34 0001c800 00000800 c726a000 c7b32000
bce0: c7aeb000 0001c800 00000656 c01e6104 00000800 c0048054 c7ac6960 c7b32000
bd00: c7ac6920 c7b32000 00000358 00000355 00002910 c01e669c 00000800 00000000
bd20: 00000000 c7b32000 c726bd90 c72cc000 00000355 c726bde0 c7b32000 00000000
bd40: 00000001 c7ac6920 000006f0 c01d8b88 c726bd94 c70599a0 00000656 0001c800
bd60: 32353128 2c303132 74616420 32202c61 00000029 c022f74c 00000005 c05a4fc0
bd80: 00000001 c05a4fc0 c05a53c0 c05a53c0 00000325 00000001 a0000013 c05a53c0
bda0: c7b32000 c7578820 00001000 20000002 00025292 c071e000 00000005 c01dcacc
bdc0: 00001000 c05a56e0 c726a000 c7578820 00000000 c01dcd10 00000000 00004830
bde0: 00025292 20000002 c75788d4 c05a53c0 00000002 c726be4c 00000000 00000002
be00: c75788d4 c00a052c c726bed8 c00a1420 0000000e 00004830 00000000 ffffffff
be20: c726a000 c00a0518 c75788d4 c00cb538 00000002 00000001 00000001 c726be50
be40: 00000041 00000005 00000000 c05a56e0 c05a4fc0 c05a53c0 c0696480 c06952a0
be60: c7578820 c757887c 00000001 c00cb488 00004830 c7578820 00000000 c0099598
be80: 00004830 00000005 00000000 00000000 c75788d4 c726bed8 7fffffff 00000000
bea0: 00000000 00000000 b5bfe13c c00a16ec 91827364 c726beb4 c726beb4 c726bebc
bec0: c726bebc 00000000 00000000 c75788d4 ffffffff c0098dc0 7ffffffd 00000000
bee0: 00000000 00000000 ffffffff 7fffffff 00000001 00000000 00000005 c0098e08
bf00: ffffffff 7fffffff 00000001 00000000 00000000 c7578820 c7b32000 ffffffff
bf20: 7fffffff c0014428 c726a000 c01dceb4 ffffffff 7fffffff b5bf9860 ffffffff
bf40: 7fffffff b5bf9860 00000076 c00f4364 ffffffff 7fffffff 00000000 c00c9fc0
bf60: 00000000 ffffffff 7fffffff c00f438c ffffffff 7fffffff 00000000 c00cab74
bf80: 00000000 00000000 c7bb1ec0 c00f45d8 0000000a 00000001 b5bf9860 0000000a
bfa0: 00004830 c00142a0 0000000a 00004830 0000000a 00000002 b5bff4f4 00000000
bfc0: 0000000a 00004830 b5bf9860 00000076 0003ecb8 000505a0 b5bfeae0 b5bfe13c
bfe0: 00000000 b5bf95d8 b6f55f8c b6f56fe4 80000010 0000000a ffffffff ffffffff
[<c0234df0>] (__up_write+0x38/0x168) from [<c02dfc1c>] (leb_write_unlock+0xbc/0x13c)
[<c02dfc1c>] (leb_write_unlock+0xbc/0x13c) from [<c02e0c14>] (ubi_eba_write_leb+0xa0/0x53c)
[<c02e0c14>] (ubi_eba_write_leb+0xa0/0x53c) from [<c02ded34>] (ubi_leb_write+0xe4/0xe8)
[<c02ded34>] (ubi_leb_write+0xe4/0xe8) from [<c01e6104>] (ubifs_leb_write+0x9c/0x128)
[<c01e6104>] (ubifs_leb_write+0x9c/0x128) from [<c01e669c>] (ubifs_wbuf_write_nolock+0x358/0x6f8)
[<c01e669c>] (ubifs_wbuf_write_nolock+0x358/0x6f8) from [<c01d8b88>] (ubifs_jnl_write_data+0x1a0/0x298)
[<c01d8b88>] (ubifs_jnl_write_data+0x1a0/0x298) from [<c01dcacc>] (do_writepage+0x8c/0x224)
[<c01dcacc>] (do_writepage+0x8c/0x224) from [<c00a052c>] (__writepage+0x14/0x64)
[<c00a052c>] (__writepage+0x14/0x64) from [<c00a1420>] (write_cache_pages+0x1cc/0x458)
[<c00a1420>] (write_cache_pages+0x1cc/0x458) from [<c00a16ec>] (generic_writepages+0x40/0x60)
[<c00a16ec>] (generic_writepages+0x40/0x60) from [<c0098dc0>] (__filemap_fdatawrite_range+0x64/0x6c)
[<c0098dc0>] (__filemap_fdatawrite_range+0x64/0x6c) from [<c0098e08>] (filemap_write_and_wait_range+0x40/0x6c)
[<c0098e08>] (filemap_write_and_wait_range+0x40/0x6c) from [<c01dceb4>] (ubifs_fsync+0x44/0xb8)
[<c01dceb4>] (ubifs_fsync+0x44/0xb8) from [<c00f4364>] (vfs_fsync_range+0x40/0x44)
[<c00f4364>] (vfs_fsync_range+0x40/0x44) from [<c00f438c>] (vfs_fsync+0x24/0x2c)
[<c00f438c>] (vfs_fsync+0x24/0x2c) from [<c00f45d8>] (do_fsync+0x28/0x50)
[<c00f45d8>] (do_fsync+0x28/0x50) from [<c00142a0>] (ret_fast_syscall+0x0/0x2c)
Code: e48a7004 e5985004 e15a0005 0a000029 (e595300c)
---[ end trace 8ee04e42747b7c3c ]---
note: RAVENNA_streame[852] exited with preempt_count 2
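As a rough aid for reading the oops above: on ARM, the word in parentheses on the "Code:" line is the instruction at the faulting PC. The following Python sketch decodes that word. The helper names are made up for illustration, and it handles only the immediate-offset LDR/STR encoding, so it is not a general disassembler. With r5 = 00000000 from the register dump, the decoded load reads from address 0xc, which fits a NULL pointer dereference.

```python
import re


def faulting_word(code_line):
    """Return the parenthesised word of an ARM oops 'Code:' line as an int."""
    m = re.search(r"\(([0-9a-fA-F]{8})\)", code_line)
    if m is None:
        raise ValueError("no parenthesised instruction word found")
    return int(m.group(1), 16)


def decode_ldr_str_imm(word):
    """Decode an ARM (A32) LDR/STR with a 12-bit immediate offset.

    Only the 'single data transfer, immediate offset' encoding
    (bits 27..25 == 0b010) is handled; anything else raises ValueError.
    """
    if (word >> 25) & 0x7 != 0b010:
        raise ValueError("not an immediate-offset LDR/STR")
    load = (word >> 20) & 1
    byte = (word >> 22) & 1
    up = (word >> 23) & 1
    imm = word & 0xFFF
    return {
        "op": ("ldr" if load else "str") + ("b" if byte else ""),
        "rd": (word >> 12) & 0xF,          # destination / source register
        "rn": (word >> 16) & 0xF,          # base register
        "offset": imm if up else -imm,     # U bit selects add or subtract
        "pre_indexed": bool((word >> 24) & 1),
        "writeback": bool((word >> 21) & 1),
    }


# Values copied from the oops above.
CODE_LINE = "Code: e48a7004 e5985004 e15a0005 0a000029 (e595300c)"
REGS = {5: 0x00000000}  # r5 from the register dump

insn = decode_ldr_str_imm(faulting_word(CODE_LINE))
fault_addr = REGS[insn["rn"]] + insn["offset"]
print("%s r%d, [r%d, #%d] -> access at 0x%x"
      % (insn["op"], insn["rd"], insn["rn"], insn["offset"], fault_addr))
# prints: ldr r3, [r5, #12] -> access at 0xc
```

The same helper applied to the second word on the line (e5985004) gives ldr r5, [r8, #4], i.e. r5 was loaded from memory two instructions earlier and came back as zero.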
Today I compiled the kernel with the debug flags for lock alloc. The test has been running for several hours now without an error.
This may sound strange, but there seems to be a connection with the max/mean erase count.
For the debug list tests I switched to different hardware. There the bug occurred rather quickly, after a few minutes.
On the system I used for my overnight tests, I never saw the bug again, even with a kernel that I had previously used to reproduce the bug several times.
>> I compiled the same kernel (and same config) with gcc-4.4.4. The write
>> test runs now for over 16 hours without error. Next step is to find
>> out whether this is due to a changed timing because of the debug flags
>> or if it's the compiler.
That was the hardware which never showed the bug again ...
As mentioned above, I also hit the bug with gcc-4.4.4 on newer hardware.
> User space tasks running in parallel with the test may play a role.
> Did you turn CONFIG_PREEMPT off? I think memory pressure and other
> effects (not related to UBI/UbiFS) may be needed to trigger the issue.
> We don't normally see this on our systems. The one time it happened,
> an application developer ran some 'ls -R' or 'find .' in parallel with
> a file intensive feature in our application. I haven't found a test to
> reproduce it reliably.
I have not tested with preemption off yet, but that test will run tonight.
>
> Fwiw,
> Bill Pringlemeir.
Thanks,
Thorsten