[PATCH v8] arm64: Expose FAR_EL1 tag bits in siginfo

Wed Jun 24 12:51:49 EDT 2020

On Wed, Jun 24, 2020 at 2:28 AM Dave Martin <Dave.Martin at arm.com> wrote:
>
> On Tue, Jun 23, 2020 at 05:40:08PM -0700, Peter Collingbourne wrote:
> > On Tue, Jun 23, 2020 at 10:52 AM Eric W. Biederman
> > <ebiederm at xmission.com> wrote:
> > >
> > > Dave Martin <Dave.Martin at arm.com> writes:
> > >
> > > > On Tue, Jun 23, 2020 at 07:54:59AM -0500, Eric W. Biederman wrote:
> > > >> Peter Collingbourne <pcc at google.com> writes:
> > > >>
> > > >> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > >> > index 47f651df781c..a8380a2b6361 100644
> > > >> > --- a/arch/arm64/kernel/traps.c
> > > >> > +++ b/arch/arm64/kernel/traps.c
> > > >> > @@ -235,20 +235,41 @@ static void arm64_show_signal(int signo, const char *str)
> > > >> >  }
> > > >> >
> > > >> >  void arm64_force_sig_fault(int signo, int code, void __user *addr,
> > > >> > +                     unsigned long far, unsigned char far_tb_mask,
> > > >> >                       const char *str)
> > > >> >  {
> > > >> >    arm64_show_signal(signo, str);
> > > >> > -  if (signo == SIGKILL)
> > > >> > +  if (signo == SIGKILL) {
> > > >> >            force_sig(SIGKILL);
> > > >> > -  else
> > > >> > -          force_sig_fault(signo, code, addr);
> > > >> > +  } else {
> > > >> > +          struct kernel_siginfo info;
> > > >> > +          clear_siginfo(&info);
> > > >> > +          info.si_signo = signo;
> > > >> > +          info.si_errno = 0;
> > > >> > +          info.si_code = code;
> > > >> > +          info.si_addr = addr;
> > > >> > +          info.si_addr_top_byte = (far >> 56) & far_tb_mask;
> > > >> > +          info.si_addr_top_byte_mask = far_tb_mask;
> > > >> > +          force_sig_info(&info);
> > > >> > +  }
> > > >> >  }
> > > >> >
> > > >> >  void arm64_force_sig_mceerr(int code, void __user *addr, short lsb,
> > > >> > -                      const char *str)
> > > >> > +                      unsigned long far, const char *str)
> > > >> >  {
> > > >> > +  struct kernel_siginfo info;
> > > >> > +
> > > >> >    arm64_show_signal(SIGBUS, str);
> > > >> > -  force_sig_mceerr(code, addr, lsb);
> > > >> > +
> > > >> > +  clear_siginfo(&info);
> > > >> > +  info.si_signo = SIGBUS;
> > > >> > +  info.si_errno = 0;
> > > >> > +  info.si_code = code;
> > > >> > +  info.si_addr = addr;
> > > >> > +  info.si_addr_lsb = lsb;
> > > >> > +  info.si_addr_top_byte = far >> 56;
> > > >> > +  info.si_addr_top_byte_mask = 0xff;
> > > >> > +  force_sig_info(&info);
> > > >> >  }
> > > >>
> > > >> I have a real problem with this construction.  force_sig_info is not an
> > > >> interface that should be used for anything except to define a wrapper
> > > >> that takes it's parameters.
> > > >
> > > > Can you elaborate?  How would you do this king of thing.
> > >
> > > There are no other uses of force_sig_info in architecture code.
> > >
> > > I just removed them _all_ because they were almost all broken.
> > > In fact your mcerr case is broken because it uses two different
> > > union members simultantiously.
> >
> > Is that really broken? I thought that the Linux kernel deliberately
> > didn't care about strict aliasing rules (the top-level Makefile passes
> > -fno-strict-aliasing) so I thought that it was valid in "Linux kernel
> > C" even though from a standards point of view it is invalid. (That
> > being said, this is probably moot with my proposed changes below
> > though.)
>
> I have a feeling that -fno-strict-aliasing only allows you to _read_ a
> different union member from the one previously written.
>
> Writing a different member from the last one written can still splatter
> on the other members IIUC.
>
> It would be better to keep things separate rather than risk
> incorrectness just to save a few bytes.
>
> IMHO -fno-strict-aliasing is no excuse for gratuitous type-punning.
>
> > > So I am looking for something like force_sig_mcerr or force_sig_fault
> > > that includes your new information that then calls force_sig_info.
> > >
> > > I know of no other way to safely use the siginfo struct.
> >
> > So you want something like:
> >
> > int force_sig_fault_with_ignored_bits(int signo, int code, void __user
> > *addr, uintptr_t addr_ignored, uintptr_t addr_ignored_mask);
> > int force_sig_mceerr_with_ignored_bits(int code, void __user *addr,
> > short lsb, uintptr_t addr_ignored, uintptr_t addr_ignored_mask);
> >
> > in kernel/signal.c and the code in arch/arm64 would call that?
> >
> > > > AIUI we absolutely need a forced signal here, we need to supply
> > > > metadata, and we don't have to open-code all that at every relevant
> > > > signal generation site...
> > > >
> > > >> It is not clear to me that if you have adapted siginfo_layout.
> > > >
> > > > Garbled sentence?
> > >
> > > Looks like.  One of the pieces of code that needs to change
> > > when siginfo gets updated is siginfo_layout so that the structure
> > > can be properly decoded and made sense of.
> > >
> > > I am not seeing anything like that.
> >
> > Okay, this has to do with copying between the compat and non-compat
> > versions of the struct? Sure, I can update that, although the code
> > would be basically non-functional on arm64 because TBI isn't supported
> > on 32-bit ARM.
> >
> > > >> > diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
> > > >> > index cb3d6c267181..6dd82373eb2d 100644
> > > >> > --- a/include/uapi/asm-generic/siginfo.h
> > > >> > +++ b/include/uapi/asm-generic/siginfo.h
> > > >> > @@ -91,6 +91,14 @@ union __sifields {
> > > >> >                            char _dummy_pkey[__ADDR_BND_PKEY_PAD];
> > > >> >                            __u32 _pkey;
> > > >> >                    } _addr_pkey;
> > > >> > +#ifdef __aarch64__
> > > >> > +                  /* used with all si_codes */
> > > >> > +                  struct {
> > > >> > +                          short _dummy_top_byte;
> > > >
> > > > ^ What's this for?  I don't have Eric's insight here.
> >
> > We would need a short's worth of padding in order to prevent the
> > fields from occupying the same address as si_addr_lsb.
> >
> > > >
> > > >> > +                          unsigned char _top_byte;
> > > >> > +                          unsigned char _top_byte_mask;
> > > >> > +                  } _addr_top_byte;
> > > >> > +#endif
> > > >> >            };
> > > >> >    } _sigfault;
> > > >> >
> > > >>
> > > >> Why the _dummy_top_byte?  Oh I see it should be spelled "short _addr_lsb;".
> > > >>
> > > >> Please remove the "#ifdef __aarch64__".  If at all possible we want to
> > > >> design this so any other architecture who has this challenge can use the
> > > >> code.  The kind of code does not get enough attention/maintenance if it
> > > >> is built for a single architecture.
> >
> > Seems reasonable. I was recently made aware that RISC-V was
> > considering a similar feature:
> > https://lists.riscv.org/g/tech-tee/topic/risc_v_tbi_proposal/72855478
> > I would have opted to expand this to other architectures on an
> > as-needed basis, but I'd also be fine with having it on all
> > architectures from the start.
> >
> > If we make this arch-independent, we have an additional concern, which
> > is "what if some future architecture wants more than one byte here?"
> > For example, an architecture may have a "top-two-bytes-ignore"
> > feature, which would imply two-byte (misnamed) "si_addr_top_byte" and
> > "si_addr_top_byte_mask" fields. And the RISC-V proposal potentially
> > implies many more ignored bits (see slide 13 of the presentation). The
> > maximum size that these fields can possibly be is the size of a
> > pointer, and with that there wouldn't be enough room in the padding at
> > this point to accommodate the new fields.
> >
> > That basically implies your earlier suggestion of adding a union
> > member here to accommodate future expansion of the union, and adding
> > the new fields after the union. I'm happy to make that change, with
> > the fields renamed "si_addr_ignored" and "si_addr_ignored_mask".
>
> I think what we need here is basically a flags word.
>
> So long as we keep a flag spare to indicate the existence of a further
> flags word, we can extend as needed.
>
> How the existence of the first flags words is detected is another
> problem.  If it only applies for newly-defined si_code values, then
> I guess si_code may be sufficient.

Existing kernels will zero-initialize unused regions of the siginfo
data structure. The zero-initialization of the padding at the end of
the struct is done by the clear_user call here:
https://github.com/torvalds/linux/blob/3e08a95294a4fb3702bb3d35ed08028433c37fe6/kernel/signal.c#L3193

and the zero-initialization of the padding between fields and unused
union members is done by the clear_siginfo function which the kernel
calls when initializing the data structure:
https://github.com/torvalds/linux/blob/3e08a95294a4fb3702bb3d35ed08028433c37fe6/include/linux/signal.h#L20

Therefore, a flag word value of 0 may be used to detect a lack of
support for flagged fields.

That being said, in this particular case, we do not need a flag word.
We can just take advantage of this zero-initialization behavior in
existing kernels to set si_addr_ignored_mask to 0, which indicates
that none of the bits in si_addr_ignored are valid.

Peter

> > > >
> > > > Does this belong in the user-facing siginfo?  It seems a bit strange,
> > > > when other closely-related information such as esr_context is in the
> > > > arch-specific signal frame.
> > > >
> > > >
> > > > If trying to make this reusable, I wonder if we should have some sort of
> > > > "address attributes" field.
> > > >
> > > > An alternative approach would be to add some opaque "arch_data" field,
> > > > that the arch code can go look at when delivering the signal.
> > >
> > > My point is arch specific hacks don't get looked at, and wind up being
> > > broken.  So I am not encouraging anything that doesn't get looked at,
> > > and winds up being broken.
>
> Arch code will get looked at, and is automatically inherently broken.
> Nor is the core code always perfect...
>
> I agree that generic is best, both for getting more eyes on it and for
> coming up with a clean design, but there's also a risk of pointless
> over-abstraction for things that just aren't generic enough.
>
> Part of the issue is that each arch necessarily has its own way of
> dumping its register state, while siginfo contains abstract diagnostic
> information.  The boundary between these two is not clear-cut: for
> example, arm64 dumps its exception syndrome register which contains
> (among other things) imformation about whether a faulted access was a
> read or write.  Is this generic information, or arch-specific
> information?
>
>
> A side problem is that siginfo_t as originally designed is quite hard to
> extend.
>
> AFAICT, any extension needs a new si_code, otherwise there is no way
> to detect that the extension fields are present.  This is fine for
> defining entirely new signal types, but seems to make it hard to add
> supplementary information for existing signals. Have I missed something
> here?
>
> Say we wanted to add extra data for SIGSEGV to indicate the size of
> access and whether it was a read or write.  If we try to add a new
> si_code for this, then all software that inspects si_code at all for
> SIGSEGV now has no idea what to do with this new si_code.
>
> Reading between the lines, I wonder whether this is part of the reason
> arches tend to go their own way:  such information can't be added
> generically precisely because it _is_ generic -- too generic to justify
> a new si_code.  If so, this problem is going to crop up again and
> again...
>
> > > > I think that's all we were trying to achieve here: tack some arch
> > > > private data onto the signal, to avoid having to stash the same info in
> > > > thread_info and pray that it doesn't get clobbered in between signal
> > > > generation and delivery.
> > >
> > > What makes it arch private data?  Why isn't it just data that your arch
> > > happens to have that other architectures don't yet.
>
> I didn't mean it must be private, just that it can be.
>
> > > > At signal delivery time, the arch signal delivery code could inspect
> > > > this data and emit it into the signal frame as appropriate for the
> > > > arch.
> > >
> > > Sorry this probably isn't what you mean but when I read that description
> > > I get the feeling that you are asking for code that won't be reviewed or
> > > looked at by anyone else.  So inevitably that code will be broken.
> > > Frankly it is bad enough finding people to review and maintain the
> > > generic code of the kernel.
>
> Does this need flag up to the arch maintainers?  Signal code has been
> heavily arch-specific for ages, and that's where the force of gravity
> seems to point.  I a lot of work has gone into cleaning this up, but it
> sounds like arch maintainers might need to push back harder on anything
> that _could_ be done in the common code.
>
> > > With that said, and your desire for this data to go into the sigframe
> > > (despite it sounding a lot like generic data that only aarch64 has
> > > implemented yet) can you remind me why siginfo comes into the equation
> > > at all?
> > >
> > > Last I remember the discussion there were some issues and the plan was
> > > to simply solve the problem generically and use siginfo, and there would
> > > not need to be any sigframe changes.
> > >
> > > But if you want to deliver via sigframe force_sig_info and all it's
> > > variants will be delivered when the kernel returns back to userspace.
> > > So there should be no need to touch siginfo or anything else in that
> > > scenario.
> >
> > My understanding is that siginfo should contain information about the
> > signal itself, while sigcontext should contain any information about
> > the machine state at the point when the signal was delivered that is
> > needed in order to restore the state after returning from a signal
> > handler. The fault address isn't really part of the restorable machine
> > state (despite the existence of a "fault_address" field in
> > sigcontext), so any information relating to it belongs (at least
> > morally) in siginfo.
>
> I think this is more than just a principle.
>
> Diagnostic information that is supposed to accompany a signal needs to
> be captured at the time the signal is generated, otherwise it may have
> been overwritten by another signal-generating event by the time the
> first signal is actually delivered.
>
> Currently this isn't handled properly in the arm64 code, so it looks
> like some diagnostic fields in the arm64 signal frame can be wrong in
> some situations.  (I know, that's your "non-generic code that hardly
> anyone relies on will be broken" argument.  But the need to keep
> diagnostic information with the signal instance it relates to feels like
> a generic problem.)
>
> I have no objection to finding a generic way to report the address tag
> information, but "address tag" is not the most generic concept in the
> world, even if there are a few arches with something analogous.
>
> Cheers
> ---Dave