AW: AW: AW: UBI leb_write_unlock NULL pointer Oops (continuation)
Wiedemer, Thorsten (Lawo AG)
Thorsten.Wiedemer at lawo.com
Thu Feb 20 10:21:39 EST 2014
Hi,
I'm back again now.
> Bill Pringlemeir wrote:
>
> $ printf "\x04\x70\x8a\xe4\x04\x50\x98\xe5\x05\x00\x5a\xe1\x29\x00\x00\x0a\x0c\x30\x95\xe5" > crash.dump
> $ objdump --disassemble-all -m arm -b binary crash.dump
>
> crash.dump: file format binary
>
>
> Disassembly of section .data:
>
> 00000000 <.data>:
> 0: e48a7004 str r7, [sl], #4
> 4: e5985004 ldr r5, [r8, #4]
> 8: e15a0005 cmp sl, r5
> c: 0a000029 beq 0xb8
> 10: e595300c ldr r3, [r5, #12]
>
> 'r5' is NULL. It seems to be the same symptom. If you run your ARM objdump with -S on either vmlinux or '__up_write', it will help confirm that it is the list corrupted again. The assembler above should match.
I don't have objdump running on my ARM system at the moment, but with rwsem-spinlock.c compiled with debug info, objdump -S -D gives the following for __up_write():
...
sem->activity = 0;
29c: e3a07000 mov r7, #0
2a0: e1a0a008 mov sl, r8
2a4: e48a7004 str r7, [sl], #4
2a8: e5985004 ldr r5, [r8, #4]
if (!list_empty(&sem->wait_list))
2ac: e15a0005 cmp sl, r5
2b0: 0a000029 beq 35c <__up_write+0xe0>
/* if we are allowed to wake writers try to grant a single write lock
* if there's a writer at the front of the queue
* - we leave the 'waiting count' incremented to signify potential
* contention
*/
if (waiter->flags & RWSEM_WAITING_FOR_WRITE) {
2b4: e595300c ldr r3, [r5, #12]
{
...
Seems to match ...
> What is 'RAVENNA_streame'? Is this your standard test and not the '8k binary' copy test or are you doing the copy test with this process also running?
This is an application which runs in parallel to our copy test. Over the last few days, Emanuel set up another test environment which seems to reproduce the error more reliably (at least on some hardware, not on all).
At the moment, proprietary applications are running in parallel, but I'll try to strip it down to a sequence which I can provide to you, if you like.
> We have 'IRQs off', which makes sense for __up_write. Trying 'ftrace_dump_on_oops' as Richard suggests would be helpful to find out what went on before. It might also make sense to dump some 'rwsem_waiter' nodes on the error? It looks like '__up_write' might normally have an empty list?
> Certainly an non-empty 'rwsem_waiter' is going to trigger the condition more often? I guess I can look to see what might cause this, even if I can not reproduce it. The 'preemp_count' has been two every time you have this; is that true?
We have now reproduced the error with function tracing enabled, so we have two hopefully valuable traces. But they are rather big (around 4MB each). Shall I use pastebin and cut them into several pieces to provide them? Or send them off-list as an email attachment?
The trace Emanuel posted on Wednesday may not be valuable. Perhaps a (different) error was triggered there due to memory pressure caused by the function tracing.
Best regards,
Thorsten