[PATCH 2/4] coresight: tmc-etr: Track perf handler.
Leo Yan
leo.yan at linaro.org
Mon Apr 26 01:25:51 BST 2021
On Fri, Apr 23, 2021 at 05:20:38PM +0800, Leo Yan wrote:
> Hi Daniel,
>
> On Wed, Apr 21, 2021 at 02:04:11PM +0200, Daniel Kiss wrote:
>
> [...]
>
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > index dd19d1d1c3b38..bf9f6311d8663 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > @@ -1511,6 +1511,12 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
> > goto out;
> > }
> >
> > + /* Serve only the tracer with the right handler */
> > + if (drvdata->perf_handle != handle) {
> > + spin_unlock_irqrestore(&drvdata->spinlock, flags);
> > + goto out;
> > + }
> > +
>
> I have a concern about this change. Let's use system-wide tracing as
> an example.
>
> If a system has 4 CPUs, then for perf with system-wide tracing the
> tool maps the AUX ring buffer four times, but the CoreSight
> driver allocates pages only once and maps these physical pages
> four times to user space. Therefore, the perf tool in userspace
> manages 4 AUX ring buffers, with each AUX ring buffer serving one
> CPU.
>
> The confusion between the CoreSight driver (in the kernel) and the
> perf tool (in userspace) is this: there is actually only one ring
> buffer for the enabled sink (let's say ETR), but there are four ring
> buffer control structures. The control structure is
> 'perf_event_mmap_page', which resides in the first page of perf's
> general ring buffer (please note, this ring buffer is different from
> the AUX ring buffer).
>
> IIUC, this patch only allows the first CPU that enables the coresight
> path to update the AUX ring buffer. This can break the following case:
>
> - Step 1: perf tool opens ETM event; we can use the command:
>
> # perf record -o ${perfdata} -e cs_etm/@tmc_etr0/ -a
> -- dd if=/dev/zero of=/dev/null
>
> - Step 2: the profiled program "dd" is first scheduled on CPU0, so
> its "perf_handle" will be assigned to "drvdata->perf_handle";
>
> - Step 3: if the program "dd" is migrated to CPU1 and never runs
> on CPU0 afterwards, then this patch will prevent the AUX ring
> buffer from being updated, because "drvdata->perf_handle" cannot
> match CPU1's handler.
To clarify, this case only happens in "snapshot" mode. As Mathieu
reminded me, "snapshot" mode is quite special: it creates an AUX
ring buffer per CPU, but when tracing is enabled without the
option "-a" for system-wide tracing, it only enables the ETM
tracer for a CPU when the profiled program is scheduled on that CPU.
To avoid excessive complexity, let's give this low priority and first
focus on the system-wide tracing case.
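For illustration, the handle mismatch described in Steps 2 and 3 can be
modelled in user space with the minimal sketch below. The names
toy_handle, toy_drvdata, toy_enable() and toy_update() are hypothetical
stand-ins, not the driver's real structures or functions, and the sketch
assumes that the first CPU to enable the sink is the one whose handle
gets recorded:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU perf output handle. */
struct toy_handle {
	int cpu;
};

/* Hypothetical stand-in for the sink's driver data. */
struct toy_drvdata {
	struct toy_handle *perf_handle;	/* handle recorded at enable time */
};

/*
 * Assumption: the first caller to enable the sink records its handle;
 * later callers share the already enabled sink and do not replace it.
 */
static void toy_enable(struct toy_drvdata *drvdata, struct toy_handle *handle)
{
	if (!drvdata->perf_handle)
		drvdata->perf_handle = handle;
}

/*
 * Models the check added by the hunk: the buffer is only updated when
 * the caller's handle matches the recorded one. Returns true when the
 * update would go ahead, false when it would be skipped.
 */
static bool toy_update(const struct toy_drvdata *drvdata,
		       const struct toy_handle *handle)
{
	return drvdata->perf_handle == handle;
}
```

If toy_enable() is called first with CPU0's handle and then with CPU1's
handle, toy_update() returns true only for CPU0's handle, so an update
requested from CPU1 is skipped.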
Thanks,
Leo
> On the other hand, I think we should change to always stick to the
> same "perf_output_handle" for all CPUs, thus it can allow all CPUs
> to use the same structure 'perf_event_mmap_page' for AUX ring buffer
> management.
>
> [...]
>
> Thanks,
> Leo