do page fault in atomic bug on arm

Alex Shi alex.shi at linaro.org
Sun Nov 26 04:07:43 PST 2017


CC Masami Hiramatsu for the ftracetest part.

Hi Russell King,

Thanks a lot for the quick response!

Regards
Alex

On 11/25/2017 03:27 AM, Russell King - ARM Linux wrote:
> On Fri, Nov 24, 2017 at 11:09:30PM +0800, Alex Shi wrote:
>> Fully agree with your analysis. Is it possible for the PC value to be
>> corrupted by heavy thermal stress or something else? The ARM64 board
>> runs well with the ftracetest of LTP.
> 
> In your first email, you said "x15 platform, which is a armv7 board."
> Here you say "ARM64 board" which isn't armv7.  There's x15 DTS under
> arch/arm/boot/dts, so I guess you mean 32-bit ARM, but who knows...
> 
> Anyway, I've tried running ftracetest on an OMAP4430 SDP board, and
> after a while with the patch I sent you, I get:
> 
> Internal error: Oops - BUG: 0 [#1] SMP ARM
> Modules linked in:
> CPU: 1 PID: 2948 Comm: ftracetest Not tainted 4.14.0+ #557
> Hardware name: Generic OMAP4 (Flattened Device Tree)
> task: ce41c100 task.stack: cc7b8000
> PC is at oops+0x0/0x4
> LR is at trace_hardirqs_on_caller+0x154/0x1e0
> pc : [<c0015adc>]    lr : [<c0086840>]    psr: 20000193
> sp : cc7b9fb0  ip : cc7b9f80  fp : 00000000
> r10: 00000000  r9 : cc7b8000  r8 : c0015c28
> r7 : 00000006  r6 : 00000004  r5 : 0009fed4  r4 : 00000001
> r3 : 00000000  r2 : cc7b9fb0  r1 : 60000193  r0 : 00000001
> Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 10c5387d  Table: 8e7b804a  DAC: 00000055
> Process ftracetest (pid: 2948, stack limit = 0xcc7b8210)
> Stack: (0xcc7b9fb0 to 0xcc7ba000)
> 9fa0:                                     00000000 00000000 0009b008 0000000d
> 9fc0: 00000001 0009fed4 00000004 00000006 0009b3e0 000a17d4 00000000 beaf6e3c
> 9fe0: 0009fed0 beaf6e20 000319e0 b6e7199c 60000193 00000001 6b6b6b6b a56b6b6b
> Backtrace: no frame pointer
> Code: e9527fff e1a00000 e28dd048 e1b0f00e (e7f001f2)
> ---[ end trace 390efe5843605357 ]---
> 
> The other CPU also oopses:
> 
> Internal error: Oops - BUG: 0 [#3] SMP ARM
> Modules linked in:
> CPU: 1 PID: 1 Comm: init Tainted: G      D         4.14.0+ #557
> Hardware name: Generic OMAP4 (Flattened Device Tree)
> task: ced04c00 task.stack: ced06000
> PC is at oops+0x0/0x4
> LR is at trace_hardirqs_on+0x14/0x18
> pc : [<c0015adc>]    lr : [<c00868e0>]    psr: 20000193
> sp : ced07fb0  ip : ced07fa0  fp : 00000000
> r10: 00000000  r9 : ced06000  r8 : c0015c28
> r7 : 0000004e  r6 : bec0acd4  r5 : 000176b4  r4 : bec0ac3c
> r3 : 00000000  r2 : ced07fb0  r1 : 60000193  r0 : c0015aa8
> Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 10c5387d  Table: 8e2d804a  DAC: 00000055
> Process init (pid: 1, stack limit = 0xced06210)
> Stack: (0xced07fb0 to 0xced08000)
> 7fa0:                                     00000000 00000000 00000000 00000000
> 7fc0: bec0ac3c 000176b4 bec0acd4 0000004e 10000000 00000000 0000a1d0 bec0ac44
> 7fe0: bec0acd8 bec0ac28 b6e54544 b6e5456c 60000193 bec0ac28 005d5555 00020201
> Backtrace: no frame pointer
> Code: e9527fff e1a00000 e28dd048 e1b0f00e (e7f001f2)
> ---[ end trace 390efe5843605358 ]---
> 
> which is exactly your bug, but caught a bit earlier.
> 
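As a reading aid for the register dumps above, here is a minimal Python sketch that decodes the "psr: 20000193" value into the same fields the oops prints on its "Flags:" line. The function name and the mode table are illustrative and are not taken from the kernel source; the bit positions are those of the AArch32 CPSR (the Jazelle bit is ignored).

```python
# Decode an AArch32 program status register value, such as the
# "psr: 20000193" reported in both oopses.

# Illustrative subset of processor modes (low five bits of the PSR).
MODES = {
    0x10: "USER_32",
    0x11: "FIQ_32",
    0x12: "IRQ_32",
    0x13: "SVC_32",
    0x17: "ABT_32",
    0x1B: "UND_32",
    0x1F: "SYS_32",
}


def decode_psr(psr):
    """Return the condition flags, interrupt masks, ISA and mode."""
    # N, Z, C, V live in bits 31..28; upper case means the flag is set.
    flags = "".join(
        letter.upper() if psr & (1 << bit) else letter
        for letter, bit in (("n", 31), ("z", 30), ("c", 29), ("v", 28))
    )
    return {
        "flags": flags,
        "irqs": "off" if psr & (1 << 7) else "on",   # I bit masks IRQs
        "fiqs": "off" if psr & (1 << 6) else "on",   # F bit masks FIQs
        "isa": "Thumb" if psr & (1 << 5) else "ARM",  # T bit
        "mode": MODES.get(psr & 0x1F, "unknown"),
    }


info = decode_psr(0x20000193)
print(info)
```

For 0x20000193 this agrees with the dump: carry set, IRQs masked, FIQs enabled, ARM state, SVC mode.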
> This happens while executing this ftrace test:
> 
> [28] Register/unregister many kprobe events
> 
> and needs a kernel with ftrace and kprobes enabled.
> 
> Unfortunately, the debug is immediately after a call to
> trace_hardirqs_on() in no_work_pending, so the LR value is
> meaningless.
> 
> So, now that we know it's tracing kprobes triggering it - it's
> trying to set tracepoints on the first 256 symbols in the kernel's
> kallsyms, which includes all sorts of things.
> 
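To make the "first 256 symbols" step concrete, here is a rough Python sketch of turning kallsyms-style lines into kprobe definitions of the form "p SYMBOL". It is not the test script itself: the function name, the filter on symbol type, and the sample input are assumptions for illustration. Only the ret_fast_syscall and oops addresses come from this thread; the other rows are made up.

```python
def kprobe_definitions(kallsyms_text, limit=256):
    """Build 'p SYMBOL' lines for the first `limit` text symbols.

    Assumption: only symbols typed 't' or 'T' are used. The real test
    may select symbols differently.
    """
    definitions = []
    for row in kallsyms_text.splitlines():
        fields = row.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        _address, sym_type, name = fields[:3]
        if sym_type not in ("t", "T"):
            continue  # skip data and other non-text symbols
        definitions.append("p " + name)
        if len(definitions) == limit:
            break
    return definitions


# Illustrative input in /proc/kallsyms format (address, type, name).
sample = """\
c0015a40 t ret_fast_syscall
c0015a80 t slow_work_pending
c0015adc t oops
c0a00000 D some_data
"""

defs = kprobe_definitions(sample)
print("\n".join(defs))
```

Each resulting line is what would be appended to the tracing kprobe_events file, which is why low-level entry code such as ret_fast_syscall ends up being probed simply because it sorts early in the symbol list.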
> With some extra debug, this doesn't look clever:
> 
> trace_kprobe: Inserting kprobe at ret_fast_syscall+0
> trace_kprobe: Inserting kprobe at slow_work_pending+0
> trace_kprobe: Inserting kprobe at ret_slow_syscall+0
> trace_kprobe: Could not insert probe at ret_slow_syscall+0: -22
> trace_kprobe: Inserting kprobe at ret_to_user+0
> trace_kprobe: Could not insert probe at ret_to_user+0: -22
> trace_kprobe: Inserting kprobe at ret_to_user_from_irq+0
> trace_kprobe: Inserting kprobe at no_work_pending+0
> trace_kprobe: Inserting kprobe at oops+0
> trace_kprobe: Could not insert probe at oops+0: -22
> trace_kprobe: Inserting kprobe at ret_from_fork+0
> trace_kprobe: Inserting kprobe at vector_swi+0
> trace_kprobe: Inserting kprobe at local_restart+0
> trace_kprobe: Inserting kprobe at __sys_trace+0
> trace_kprobe: Inserting kprobe at __sys_trace_return+0
> trace_kprobe: Inserting kprobe at __sys_trace_return_nosave+0
> trace_kprobe: Could not insert probe at __sys_trace_return_nosave+0: -22
> trace_kprobe: Inserting kprobe at __cr_alignment+0
> trace_kprobe: Could not insert probe at __cr_alignment+0: -22
> trace_kprobe: Inserting kprobe at sys_call_table+0
> trace_kprobe: Inserting kprobe at sys_syscall+0
> trace_kprobe: Inserting kprobe at sys_sigreturn_wrapper+0
> trace_kprobe: Inserting kprobe at sys_rt_sigreturn_wrapper+0
> trace_kprobe: Inserting kprobe at sys_statfs64_wrapper+0
> trace_kprobe: Inserting kprobe at sys_fstatfs64_wrapper+0
> 
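The log above is easier to reason about when split into probes that were accepted and probes that were rejected. The following Python sketch does that for the first few lines of the debug output; the function name is illustrative, and the message format is assumed to be exactly as printed above. The -22 return value corresponds to -EINVAL.

```python
import errno
import re

# First lines of the debug output quoted above.
LOG = """\
trace_kprobe: Inserting kprobe at ret_fast_syscall+0
trace_kprobe: Inserting kprobe at slow_work_pending+0
trace_kprobe: Inserting kprobe at ret_slow_syscall+0
trace_kprobe: Could not insert probe at ret_slow_syscall+0: -22
trace_kprobe: Inserting kprobe at ret_to_user+0
trace_kprobe: Could not insert probe at ret_to_user+0: -22
trace_kprobe: Inserting kprobe at ret_to_user_from_irq+0
trace_kprobe: Inserting kprobe at no_work_pending+0
trace_kprobe: Inserting kprobe at oops+0
trace_kprobe: Could not insert probe at oops+0: -22
"""

ATTEMPT = re.compile(r"trace_kprobe: Inserting kprobe at (\S+)\+(\d+)$")
FAILURE = re.compile(
    r"trace_kprobe: Could not insert probe at (\S+)\+(\d+): (-?\d+)$"
)


def summarize(log):
    """Return (symbols probed successfully, {failed symbol: error code})."""
    attempted = []
    failed = {}
    for line in log.splitlines():
        match = ATTEMPT.match(line)
        if match:
            attempted.append(match.group(1))
            continue
        match = FAILURE.match(line)
        if match:
            failed[match.group(1)] = int(match.group(3))
    inserted = [sym for sym in attempted if sym not in failed]
    return inserted, failed


inserted, failed = summarize(LOG)
print("inserted:", inserted)
for symbol, code in failed.items():
    print("rejected:", symbol, errno.errorcode.get(-code, code))
```

On this excerpt the accepted probes are ret_fast_syscall, slow_work_pending, ret_to_user_from_irq and no_work_pending, all of which sit on the syscall return path.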
> I wouldn't be surprised if some of those were the cause of it -
> for example, what guarantee do we have that a trace kprobe inserted
> at ret_fast_syscall which starts with this:
> 
> c0015a40:       e5ad0008        str     r0, [sp, #8]!
> 
> will be handled correctly?  I can't say, I've virtually no knowledge
> about kprobes, but I guess it isn't - especially as there's this
> comment in the ARM kprobes code:
> 
>          * Never instrument insn like 'str r0, [sp, +/-r1]'. Also, insn likes
>          * 'str r0, [sp, #-68]' should also be prohibited.
> 
> Clearly, that's not the case as the kprobes insert on
> ret_fast_syscall succeeded.
> 
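To see why the instruction at ret_fast_syscall is suspicious, it helps to decode the word e5ad0008 by hand. The Python sketch below covers only the immediate-offset form of the ARM load/store word/byte encoding. The function name and the returned dictionary are illustrative and do not mirror the kernel's kprobes decoder.

```python
def decode_load_store_imm(insn):
    """Decode an ARM-mode LDR/STR with a 12-bit immediate offset."""
    if (insn >> 26) & 0x3 != 0b01:
        raise ValueError("not a load/store word/byte instruction")
    if insn & (1 << 25):
        raise ValueError("register-offset form is not handled here")
    pre_indexed = bool(insn & (1 << 24))  # P bit
    w_bit = bool(insn & (1 << 21))        # W bit
    return {
        "op": "ldr" if insn & (1 << 20) else "str",  # L bit
        "byte": bool(insn & (1 << 22)),              # B bit
        "pre_indexed": pre_indexed,
        "add_offset": bool(insn & (1 << 23)),        # U bit
        # Post-indexed forms always update the base register.
        "writeback": w_bit or not pre_indexed,
        "rn": (insn >> 16) & 0xF,                    # base register
        "rd": (insn >> 12) & 0xF,                    # transferred register
        "imm12": insn & 0xFFF,
    }


fields = decode_load_store_imm(0xE5AD0008)
print(fields)
```

The decoded fields describe a store of r0 with base register 13 (sp), an added offset of 8, pre-indexing and writeback, which matches the disassembly "str r0, [sp, #8]!". The instruction therefore both writes to the stack and modifies sp, the kind of stack access the quoted kprobes comment is concerned about.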
> Adding Tixy, as he's more knowledgeable in this area - I suggest
> someone knowledgeable in this area runs
> 
> 	ftracetest test.d/kprobe/multiple_kprobes.tc
> 
> and fixes these bugs... running the entire ftracetest suite
> would probably also be a very good idea.
> 


