[PATCH v14 7/8] signal: define the field siginfo.si_faultflags

Mon Nov 9 20:54:41 EST 2020

Peter you are patching buggy code for your siginfo extension can
you please ignore vas-fault.c.  The code in vas-fault.c should
be fixed separately.  Futher it uses clear_siginfo so you should
get well defined behavior even if your new field is not initialized.

I have copied the powerpc folks so hopefully this buggy code
can be fixed.

> diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
> index 3d21fce254b7..877e7d5fb4a2 100644
> --- a/arch/powerpc/platforms/powernv/vas-fault.c
> +++ b/arch/powerpc/platforms/powernv/vas-fault.c
> @@ -154,6 +154,7 @@ static void update_csb(struct vas_window *window,
>  	info.si_errno = EFAULT;
>  	info.si_code = SEGV_MAPERR;
>  	info.si_addr = csb_addr;
> +	info.si_faultflags = 0;

Powerpc folks.  This code was introduced in c96c4436aba4 ("powerpc/vas:
Update CSB and notify process for fault CRBs") and is badly buggy.

Let me count the bugs:

a) Using kill_pid_info.  That performs a permission check that
   does not make sense from a kernel thread.

b) Manually filling in struct siginfo.  Everyone gets it wrong
   and the powerpc code is no exception setting si_errno when
   that is something Linux as a rule does not do.

Technically we have send_sig_fault to handle sending
a fault from a non-sychrnous context but I am not convinced
it make sense in this case.

c) Sending an asynchronous SIGSEGV with the si_code set to SEGV_MAPERR.
   How can userspace detect it is an asynchronous signal?  What can
   userspace do if it detects an asynchronous signal?  If userspace is
   so buggered as to give your kernel thread a bogus address I suspect
   uncerimonious sending SIGKILL is probably the best you can do.

There are some additional questionable things in that code like taking a
task_struct reference simply to be able to test tsk->flags but no
locks are held to ensure that tsk->flags are meaningful.  Nor are
any tests performed to see if the task being tested still uses
the designated mm.  I suspect exec could have been called.

In which case the code needs to check the mm, or at least play with
exec_id to ensure you are not improperly signaling a process after exec.

None of this is to say that update_csb is fundmentally bad or hard to
correct just that it has some significant defects in it's implementation
right now that need to be corrected.  I am hoping a detailed accounting
and pointing out those defects will allow the bug to be fixed.

Thank you,
Eric