[PATCH 2/4] coresight: tmc-etr: Track perf handler.

Leo Yan leo.yan at linaro.org
Mon Apr 26 01:25:51 BST 2021


On Fri, Apr 23, 2021 at 05:20:38PM +0800, Leo Yan wrote:
> Hi Daniel,
> 
> On Wed, Apr 21, 2021 at 02:04:11PM +0200, Daniel Kiss wrote:
> 
> [...]
> 
> > diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > index dd19d1d1c3b38..bf9f6311d8663 100644
> > --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> > @@ -1511,6 +1511,12 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
> >  		goto out;
> >  	}
> >  
> > +	/* Serve only the tracer with the right handler */
> > +	if (drvdata->perf_handle != handle) {
> > +		spin_unlock_irqrestore(&drvdata->spinlock, flags);
> > +		goto out;
> > +	}
> > +
> 
> I have a concern about this change.  Let's use system-wide tracing as
> an example.
> 
> If a system has 4 CPUs, then for perf with system-wide tracing the
> tool maps the AUX ring buffer four times, but the CoreSight driver
> allocates the pages only once and maps these physical pages into
> user space four times.  Therefore, the perf tool in userspace
> manages 4 AUX ring buffers, and every AUX ring buffer serves one
> CPU.
> 
> The confusion between the CoreSight driver (in the kernel) and the
> perf tool (in userspace) is: there is actually only one ring
> buffer for the enabled sink (let's say ETR), but there are four ring
> buffer control structures.  The control structure is
> 'perf_event_mmap_page', which resides in the first page of perf's
> general ring buffer (please note, this ring buffer is different from
> the AUX ring buffer).
> 
> IIUC, this patch only allows the first CPU which enables the coresight
> path to update the AUX ring buffer.  This can break the following case:
> 
>   - Step 1: perf tool opens ETM event; we can use the command:
> 
>     # perf record -o ${perfdata} -e cs_etm/@tmc_etr0/ -a
>            -- dd if=/dev/zero of=/dev/null
> 
>   - Step 2: the profiled program "dd" is first scheduled on CPU0, so
>     its "perf_handle" will be assigned to "drvdata->perf_handle";
> 
>   - Step 3: if the program "dd" is migrated to CPU1 and never runs
>     on CPU0 afterwards, then this patch will prevent updates to the AUX
>     ring buffer, because "drvdata->perf_handle" cannot match
>     CPU1's handler.

To clarify: this case only happens in "snapshot" mode.  As Mathieu
reminded me, "snapshot" mode is quite special: it creates an AUX ring
buffer per CPU, but when tracing is enabled without the "-a" option
for system-wide tracing, it only enables the ETM tracer for a CPU
when the profiled program is scheduled on that CPU.
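
For illustration only, Steps 2 and 3 quoted above can be modelled in
plain user-space C.  This is a minimal sketch, not driver code: all
model_* names are hypothetical, and it assumes that the first CPU to
enable the sink is the one whose handle gets recorded, as Step 2
describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct perf_output_handle. */
struct model_handle {
	int cpu;	/* CPU whose event owns this handle */
};

/* Hypothetical stand-in for the ETR drvdata; only the tracked handle. */
struct model_drvdata {
	const struct model_handle *perf_handle;
};

/* Assumption: the first CPU to enable the sink records its handle. */
static void model_enable_sink(struct model_drvdata *drvdata,
			      const struct model_handle *handle)
{
	if (!drvdata->perf_handle)
		drvdata->perf_handle = handle;
}

/* Mirrors the check added by the patch: only the tracked handle may update. */
static bool model_update_allowed(const struct model_drvdata *drvdata,
				 const struct model_handle *handle)
{
	return drvdata->perf_handle == handle;
}

/* Walks through Steps 2 and 3; returns true if CPU1's update is dropped. */
static bool model_migration_drops_update(void)
{
	struct model_handle cpu0 = { .cpu = 0 };
	struct model_handle cpu1 = { .cpu = 1 };
	struct model_drvdata drvdata = { .perf_handle = NULL };

	model_enable_sink(&drvdata, &cpu0);	/* Step 2: "dd" runs on CPU0 */
	model_enable_sink(&drvdata, &cpu1);	/* Step 3: "dd" migrates to CPU1 */

	/* CPU0's handle is the tracked one, so its update would pass. */
	assert(model_update_allowed(&drvdata, &cpu0));
	return !model_update_allowed(&drvdata, &cpu1);
}
```

In this model, model_migration_drops_update() returns true: once
"dd" stays on CPU1, its handle never matches the tracked one, so the
AUX ring buffer is not updated.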

To avoid unnecessary complexity, let's give this low priority and
focus first on the system-wide tracing case.

Thanks,
Leo

> On the other hand, I think we should change to always stick to the
> same "perf_output_handle" for all CPUs, so that all CPUs can use
> the same 'perf_event_mmap_page' structure for AUX ring buffer
> management.

> 
> [...]
> 
> Thanks,
> Leo


