[RFC PATCH 1/2] perf: arm_spe: Fix consistency of PMSCR register bit CX

Will Deacon will at kernel.org
Tue Jan 18 02:07:02 PST 2022


On Mon, Jan 17, 2022 at 12:44:31PM +0000, German Gomez wrote:
> The arm_spe_pmu driver will enable the CX bit of the PMSCR register in
> order to add CONTEXT packets into the traces if the owner of the perf
> event runs with capabilities CAP_PERFMON or CAP_SYS_ADMIN.
> 
> The value of the bit is computed in the arm_spe_event_to_pmscr function
> [1], and the check for capabilities happens in [2] in the pmu init
> callback. This suggests that the value of the CX bit should remain
> consistent for the duration of the perf session.
> 
> However, the function arm_spe_event_to_pmscr may be called later during
> the start callback [3] when the "current" process is not the owner of
> the perf session, so the CX bit is currently not consistent. Consider
> the following example:
> 
>   1. Run a process in the background with capability CAP_SYS_ADMIN in CPU0.
> 
>     $ taskset --cpu-list 0 sudo dd if=/dev/random of=/dev/null &
>     [3] 3806
> 
>   2. Begin a perf session _without_ capabilities (we shouldn't see CONTEXT packets).
> 
>     $ perf record -e arm_spe_0// -C0 -- sleep 1
>     $ perf report -D | grep CONTEXT
>     .  0000000e:  65 df 0e 00 00                                  CONTEXT 0xedf el2
>     .  0000004e:  65 df 0e 00 00                                  CONTEXT 0xedf el2
>     .  0000008e:  65 df 0e 00 00                                  CONTEXT 0xedf el2
>     [...]
> 
> As can be seen, the traces begin showing CONTEXT packets when the pid is
> 0xedf (3807).

So to be clear: we shouldn't be reporting these packets because 'perf'
doesn't have the right capabilities, but we evaluate that in the context of
'dd' (running as root) and so incorrectly grant the permission. Correct?

> This happens because the pmu start callback is run when
> the current process is not the owner of the perf session, so the CX
> register bit is set.

This doesn't really seem SPE-specific to me -- the perf_allow_*() helpers
also operate implicitly on the current task. How do other PMU drivers avoid
falling into this trap?

> One way to fix this is by caching the value of the CX bit during the
> initialization of the PMU event, so that it remains consistent for the
> duration of the session.

It doesn't feel right to stash this in 'struct arm_spe_pmu' during event
initialisation -- wouldn't that allow perf to continue creating new events
with CX set, even if the paranoid sysctl was changed dynamically? Instead,
I think it would be better if the capabilities were stash in the event
itself somehow at initialisation time.

Will



More information about the linux-arm-kernel mailing list