crash after receiving SIGCHLD during system call
Russell King - ARM Linux
linux at armlinux.org.uk
Wed May 17 16:02:36 PDT 2017
On Wed, May 17, 2017 at 04:28:44PM -0600, David Mosberger wrote:
> OK, since I see various faults, including SIGILL, I used
> user_debug=63. Here is one example:
>
> 2017-05-17 22:12:20: (log.c.217) server started
> [ 129.810000] pgd = cf0b4000
> [ 129.810000] [00000073] *pgd=2f903831, *pte=00000000, *ppte=00000000
> [ 129.820000] CPU: 0 PID: 701 Comm: lighttpd Not tainted 4.9.28+ #58
> [ 129.820000] Hardware name: Atmel SAMA5
> [ 129.830000] task: cecfcd80 task.stack: cf102000
> [ 129.830000] PC is at 0x18af4 <-- points to "movle r6, r3" instruction
> [ 129.830000] LR is at 0xb6c04510
> [ 129.840000] pc : [<00018af4>] lr : [<b6c04510>] psr: 00070030
> [ 129.840000] sp : bee098ec ip : ffffffff fp : 01ee4740
> [ 129.850000] r10: 00000008 r9 : 00000000 r8 : b6d12c40
> [ 129.850000] r7 : 00034684 r6 : 00000062 r5 : ffffffff r4 : 00000000
> [ 129.860000] r3 : ff000000 r2 : bee09978 r1 : bee098f8 r0 : 00000073
> [ 129.870000] Flags: nzcv IRQs on FIQs on Mode USER_32 ISA Thumb
> Segment user
...
> Program received signal SIGSEGV, Segmentation fault.
>
> I'm not very good at reading ARM tombstones but if I read this right,
> the kernel got a page fault due to a data access but a "movle r6, r3"
> instruction doesn't access data memory. Are we dealing with a
> instruction cache issue?
>
> And it says we're in "Thumb" mode? That shouldn't be the case.
That does appear to be the case - the PSR value confirms it.
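For anyone who wants to check the PSR by hand, here is a minimal sketch that decodes the value from the oops (psr: 00070030). The constant names mirror the kernel's PSR_* definitions, but this is a standalone illustration rather than kernel code, and it ignores the J (Jazelle) bit for simplicity.

```python
# Bit positions follow the ARM CPSR layout. The names mirror the kernel's
# PSR_* constants, but this is a standalone illustration (J bit ignored).
PSR_T_BIT = 0x00000020
PSR_F_BIT = 0x00000040
PSR_I_BIT = 0x00000080
MODE_MASK = 0x0000001F
USR_MODE = 0x00000010


def decode_psr(psr):
    """Summarise a 32-bit PSR value in roughly the style of the oops output."""
    # Upper-case letter = flag set, lower-case = flag clear.
    flags = "".join(
        name.upper() if psr & (1 << bit) else name
        for name, bit in (("n", 31), ("z", 30), ("c", 29), ("v", 28))
    )
    return {
        "flags": flags,
        "irqs_on": (psr & PSR_I_BIT) == 0,
        "fiqs_on": (psr & PSR_F_BIT) == 0,
        "user_mode": (psr & MODE_MASK) == USR_MODE,
        "isa": "Thumb" if psr & PSR_T_BIT else "ARM",
    }


print(decode_psr(0x00070030))
```

With the low byte at 0x30, bit 5 (T) is set and the mode field is 0x10 (user), which agrees with the "ISA Thumb" and "Mode USER_32" text in the dump.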
The segfault is at address 0x73, and an ARM "movle r6, r3" instruction
assembles to 0xd1a06003, which would correspond with Thumb:
   0:	d1a0      	bne.n	ffffff44 <.text+0xffffff44>
   2:	6003      	str	r3, [r0, #0]
Since r0 is 0x00000073, this ties up.
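The reinterpretation can be reproduced with a few lines of Python. This is only a sketch: it assumes a little-endian system (so the low halfword of the ARM word sits at the lower address), it decodes just the one Thumb encoding needed here (STR immediate, T1), and the function names are made up for the illustration. The register values are taken from the oops above.

```python
import struct


def thumb_halfwords(arm_word):
    """Split a 32-bit ARM instruction word into two 16-bit halfwords,
    in memory order, assuming little-endian storage."""
    return struct.unpack("<HH", struct.pack("<I", arm_word))


def decode_str_imm_t1(halfword):
    """Decode Thumb 'STR Rt, [Rn, #imm]' (encoding T1).
    Returns (rt, rn, imm), or None if the halfword is not that encoding."""
    if halfword >> 11 != 0b01100:
        return None
    imm = ((halfword >> 6) & 0x1F) * 4  # imm5 is scaled by 4
    rn = (halfword >> 3) & 0x7
    rt = halfword & 0x7
    return rt, rn, imm


# Register values from the oops.
regs = {0: 0x00000073, 3: 0xFF000000}

first, second = thumb_halfwords(0xD1A06003)  # ARM "movle r6, r3"
rt, rn, imm = decode_str_imm_t1(first)
print(f"halfwords in memory order: {first:#06x} {second:#06x}")
print(f"str r{rt}, [r{rn}, #{imm}] stores to {regs[rn] + imm:#x}")
```

Under that assumption the 0x6003 halfword is the one at the faulting PC, and the store address works out to r0 + 0 = 0x73, matching the fault address in the dump.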
So, the problem seems to be that the T bit in the PSR is somehow
getting set.
The kernel signal handling merely saves the PSR value it got on entry
(in the pt_regs structure) onto the userspace stack as part of the
mcontext (see setup_sigframe in arch/arm/kernel/signal.c). I think
you've confirmed that the saved information looks correct.
The question then becomes what happens after the signal handler
returns. If there is no sigreturn or rt_sigreturn syscall, then the
return is being done entirely by userspace, which means userspace is
responsible for unstacking the mcontext, including switching to the
correct ISA.
If there is a sigreturn syscall, the kernel will unstack the mcontext
(see sys_*sigreturn in arch/arm/kernel/signal.c), replacing the syscall's
pt_regs with the saved mcontext registers. The resulting state is
validated (to prevent userspace gaining privileged modes) before
returning. So the T bit should be restored to its saved value, unless
something in userspace set it in the stacked mcontext.
The validation will fix up the CPSR state if it looks bad (as a
belt-and-braces measure) before returning zero to indicate illegal
state, which will result in a forced SIGSEGV being delivered to the
program.
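To make the validation step concrete, here is a simplified Python model of that kind of check. It is an assumption-laden sketch, not the kernel's code: the function name validate_user_cpsr and the KEEP_MASK constant are invented for the illustration, the 26-bit mode handling is left out, and the real logic should be read from the kernel source.

```python
# Simplified model of a user-mode CPSR sanity check. NOT the kernel code:
# validate_user_cpsr and KEEP_MASK are hypothetical names for this sketch.
PSR_T_BIT = 0x00000020
PSR_F_BIT = 0x00000040
PSR_I_BIT = 0x00000080
PSR_A_BIT = 0x00000100
MODE_MASK = 0x0000001F
USR_MODE = 0x00000010
# Keep the flag/status/extension fields, the T bit and the 32-bit mode bit.
KEEP_MASK = 0xFFFFFF00 | PSR_T_BIT | 0x00000010


def validate_user_cpsr(cpsr):
    """Return (ok, cpsr_after_fixup) for a CPSR restored from userspace."""
    # Userspace never gets to mask FIQs or asynchronous aborts.
    cpsr &= ~(PSR_F_BIT | PSR_A_BIT) & 0xFFFFFFFF
    if (cpsr & PSR_I_BIT) == 0 and (cpsr & MODE_MASK) == USR_MODE:
        return True, cpsr
    # Illegal state: force it back to something sane and report failure,
    # which in the kernel leads to the forced SIGSEGV.
    return False, (cpsr & KEEP_MASK) | USR_MODE


print(validate_user_cpsr(0x00070030))  # user mode with T set
print(validate_user_cpsr(0x00000013))  # attempt to return in SVC mode
```

In this model a user-mode CPSR with the T bit set passes the check, since Thumb is a legitimate userspace state. A check of this kind therefore guards against privileged modes, not against a T bit that was wrongly set in the saved context.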
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.