A bug about system call on ARM

Russell King - ARM Linux linux at arm.linux.org.uk
Mon Jun 3 06:45:34 EDT 2013


On Mon, Jun 03, 2013 at 11:27:23AM +0100, Will Deacon wrote:
> Hi Russell,
> 
> On Mon, Jun 03, 2013 at 11:18:09AM +0100, Russell King - ARM Linux wrote:
> > On Thu, May 30, 2013 at 12:41:12PM +0100, Will Deacon wrote:
> > > +#if defined(CONFIG_OABI_COMPAT) || !defined(CONFIG_AEABI)
> > > +	/*
> > > +	 * We may have faulted trying to load the SWI instruction due to
> > > +	 * concurrent page aging on another CPU. In this case, return
> > > +	 * back to the swi instruction and fault the page back.
> > > +	 */
> > > +9001:
> > > +	sub	lr, lr, #4
> > > +	str	lr, [sp, #S_PC]
> > > +	b	ret_fast_syscall
> > > +#endif
> > 
> > The comment is wrong.  If we get here, it means that the fault from
> > trying to loading the instruction can't be fixed up.  Arguably, that
> > should result in a SIGSEGV being sent immediately, but we'll get to
> > that when we then try to re-load the instruction.
> 
> Why would we kill the application in this case? The reported problem is
> where one CPU ages the page containing the swi instruction (mkold =>
> clears L_PTE_YOUNG => write 0 to the pte) in between the other CPU executing
> the swi and the kernel trying to read the immediate. The VMA is fine.

If you mark the instruction was a user-accessing instruction, the kernel
will handle the resulting exception, trying to make the page accessible.
If it is successful, then execution resumes as normal at the faulting
instruction and continues as if nothing happened.

If it can't make the page accessible (eg, out of memory) the exception
handler path (your code above) will be called instead.  Normal action in
that case would be for a system call to return -EFAULT, but in this case
we can't know what the syscall was, so we don't know if userspace will
even pay attention to the returned error code.  In any case, if the page
is no longer accessible, it's going to end up being killed by a SEGV
when we eventually return to userspace anyway.

> > What it means is that the page we were trying to execute has been
> > unmapped beneath us.
> 
> Yes, as a result of the kernel aging it.

No - see above.  The exception path is for more serious conditions than
that.



More information about the linux-arm-kernel mailing list