[PATCHv2 2/4] coresight: tmc-etf: Fix NULL ptr dereference in tmc_enable_etf_sink_perf()

Fri Oct 23 09:29:54 EDT 2020

On 10/23/20 2:16 PM, Peter Zijlstra wrote:
> On Fri, Oct 23, 2020 at 01:56:47PM +0100, Suzuki Poulose wrote:
>> On 10/23/20 11:54 AM, Peter Zijlstra wrote:
> 
>>> I think I'm more confused now :-/
>>>
>>> Where do we use ->owner after event creation? The moment you create your
>>> eventN you create the link to sink0. That link either succeeds (same
>>> 'cookie') or fails.
>>
>> The event->sink link is established at creation. At event::add(), we
>> check the sink is free (i.e, it is inactive) or is used by an event
>> of the same session (this is where the owner field *was* required. But
>> this is not needed anymore, as we cache the "owner" read pid in the
>> handle->rb->aux_priv for each event and this is compared against the
>> pid from the handle currently driving the hardware)
> 
> *groan*.. that's going to be a mess with sinks that are shared between
> CPUs :/
> 
>>> I'm also not seeing why exactly we need ->owner in the first place.
>>>
>>> Suppose we make the sink0 device return -EBUSY on open() when it is
>>> active. Then a perf session can open the sink0 device, create perf
>>> events and attach them to the sink0 device using
>>> perf_event_attr::config2. The events will attach to sink0 and increment
>>> its usage count, such that any further open() will fail.
>>
>> Thats where we are diverging. The sink device doesn't have any fops. It
>> is all managed by the coresight driver transparent to the perf tool. All
>> the perf tool does is, specifying which sink to use (btw, we now have
>> automatic sink selection support which gets rid of this, and uses
>> the best possible sink e.g, in case of per-CPU sinks).
> 
> per-CPU sinks sounds a lot better.
> 
> I'm really not convinced it makes sense to do what you do with shared
> sinks though. You'll loose random parts of the execution trace because
> of what the other CPUs do.

The ETM trace protocol has in built TraceID to distinguish the packets
and thus we could decode the trace streams from the shared buffer.
[ But, we don't have buffer overflow interrupts (I am keeping the lid 
closed on that can, for the sake of keeping sanity ;-) ), and thus
any shared session could easily loose data unless we tune the AUX
buffer size to a really large buffer ].

> 
> Full exclusive sink access is far more deterministic.
> 
>>> Once the events are created, the perf tool close()s the sink0 device,
>>> which is now will in-use by the events. No other events can be attached
>>> to it.
>>>
>>> Or are you doing the event->sink mapping every time you do: pmu::add()?
>>> That sounds insane.
>>
>> Sink is already mapped at event create. But yes, the refcount on the
>> sink is managed at start/stop. Thats when we need to make sure that the
>> event being scheduled belongs to the same owner as the one already
>> driving the sink.
> 
> pmu::add() I might hope, because pmu::start() is not allowed to fail.
> 

Right. If we can't get the sink, we simply truncate the buffer.

>> That way another session could use the same sink if it is free. i.e
>>
>> perf record -e cs_etm/@sink0/u --per-thread app1
>>
>> and
>>
>> perf record -e cs_etm/@sink0/u --per-thread app2
>>
>> both can work as long as the sink is not used by the other session.
> 
> Like said above, if sink is shared between CPUs, that's going to be a
> trainwreck :/ Why do you want that?

That ship has sailed. That is how the current generation of systems are,
unfortunately. But as I said, this is changing and there are guidelines
in place to avoid these kind of topologies. With the future
technologies, this will be completely gone.

> 
> And once you have per-CPU sinks like mentioned above, the whole problem
> goes away.

True, until then, this is the best we could do.

Suzuki