[PATCH v4 09/17] perf/core: Use static_call to optimize perf_guest_info_callbacks
Kees Cook
keescook at chromium.org
Sun Feb 6 10:45:15 PST 2022
On Fri, Feb 04, 2022 at 09:35:49AM -0800, Sami Tolvanen wrote:
> On Wed, Feb 2, 2022 at 10:43 AM Sean Christopherson <seanjc at google.com> wrote:
> > > +DEFINE_STATIC_CALL_RET0(__perf_guest_state, *perf_guest_cbs->state);
> > > +DEFINE_STATIC_CALL_RET0(__perf_guest_get_ip, *perf_guest_cbs->get_ip);
> > > +DEFINE_STATIC_CALL_RET0(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);
> >
> > Using __static_call_return0() makes clang's CFI sad on arm64 due to the resulting
> > function prototype mistmatch, which IIUC, is verified by clang's __cfi_check()
> > for indirect calls, i.e. architectures without CONFIG_HAVE_STATIC_CALL.
> >
> > We could fudge around the issue by using stubs, massaging prototypes, etc..., but
> > that means doing that for every arch-agnostic user of __static_call_return0().
> >
> > Any clever ideas? Can we do something like generate a unique function for every
> > DEFINE_STATIC_CALL_RET0 for CONFIG_HAVE_STATIC_CALL=n, e.g. using typeof() to
> > get the prototype?
>
> I'm not sure there's a clever fix for this. On architectures without
> HAVE_STATIC_CALL, this is an indirect call to a function with a
> mismatching type, which CFI is intended to catch.
>
> The obvious way to solve the problem would be to use a stub function
> with the correct type, which I agree, isn't going to scale. You can
> alternatively check if .func points to __static_call_return0 and not
> make the indirect call if it does. If neither of these options are
> feasible, you can disable CFI checking in the functions that have
> these static calls using the __nocfi attribute.
>
> Kees, any thoughts?
I'm digging through the macros to sort this out, but IIUC, an example of
the problem is:
perf_guest_cbs->handle_intel_pt_intr is:
unsigned int (*handle_intel_pt_intr)(void);
The declaration for this starts with:
DECLARE_STATIC_CALL(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);
which produces:
extern struct static_call_key STATIC_CALL_KEY(__perf_guest_handle_intel_pt_intr);
extern typeof(*perf_guest_cbs->handle_intel_pt_intr) STATIC_CALL_TRAMP(__perf_guest_handle_intel_pt_intr);
and the last line becomes:
extern unsigned int (*__SCT____perf_guest_handle_intel_pt_intr)(void);
with !HAVE_STATIC_CALL, when perf_guest_handle_intel_pt_intr() does:
return static_call(__perf_guest_handle_intel_pt_intr)();
it is resolving static_call() into:
((typeof(STATIC_CALL_TRAMP(name))*)(STATIC_CALL_KEY(name).func))
so the caller is expecting "unsigned int (*)(void)" but the prototype
of __static_call_return0 is "long (*)(void)":
long __static_call_return0(void);
Could we simply declare a type-matched ret0 trampoline too?
#define STATIC_CALL_TRAMP_RET0_PREFIX __SCT__ret0__
#define STATIC_CALL_TRAMP_RET0(name) __PASTE(STATIC_CALL_TRAMP_RET0_PREFIX, name)
#define DEFINE_STATIC_CALL_RET0(name, _func) \
static typeof(_func) STATIC_CALL_TRAMP_RET0(name) { return 0; } \
__DEFINE_STATIC_CALL(name, _func, STATIC_CALL_TRAMP_RET0(name));
static_call_update(__perf_guest_handle_intel_pt_intr,
(void *)&STATIC_CALL_TRAMP_RET0(__perf_guest_handle_intel_pt_intr))
--
Kees Cook
More information about the linux-arm-kernel
mailing list