[PATCH][V3] arm64: perf: Get the wrong PC value in REGS_ABI_32 mode

Will Deacon will at kernel.org
Tue Jun 23 13:19:10 EDT 2020


On Thu, Jun 18, 2020 at 02:03:32PM +0100, Mark Rutland wrote:
> On Thu, May 28, 2020 at 08:54:19AM +0100, Will Deacon wrote:
> > On Thu, May 28, 2020 at 09:06:07AM +0800, Jiping Ma wrote:
> > > On 05/27/2020 11:19 PM, Mark Rutland wrote:
> > > > On Wed, May 27, 2020 at 09:33:00AM +0800, Jiping Ma wrote:
> > > > > On 05/26/2020 06:26 PM, Mark Rutland wrote:
> > > > > > On Mon, May 11, 2020 at 10:52:07AM +0800, Jiping Ma wrote:
> > > > > This modification can not fix our issue,� we need
> > > > > perf_reg_abi(current) == PERF_SAMPLE_REGS_ABI_32 to judge if it is 32-bit
> > > > > task or not,
> > > > > then return the correct PC value.
> > > > I must be missing something here.
> > > > 
> > > > The core code perf_reg_abi(task) is called with the task being sampled,
> > > > and the regs are from the task being sampled. For a userspace sample for
> > > > a compat task, compat_user_mode(regs) should be equivalent to the
> > > > is_compat_thread(task_thread_info(task)) check.
> > > > 
> > > > What am I missing?
> > > This issue caused by PC value is not correct. regs are sampled in function
> > > perf_output_sample_regs, that call perf_reg_value(regs, bit) to get PC
> > > value.
> > > PC value is regs[15] in perf_reg_value() function. it should be regs[32].
> > > 
> > > perf_output_sample_regs(struct perf_output_handle *handle,
> > > ����������������������� struct pt_regs *regs, u64 mask)
> > > {
> > > ������� int bit;
> > > ������� DECLARE_BITMAP(_mask, 64);
> > > 
> > > ������� bitmap_from_u64(_mask, mask);
> > > ������� for_each_set_bit(bit, _mask, sizeof(mask) * BITS_PER_BYTE) {
> > > ��������������� u64 val;
> > > 
> > > ��������������� val = perf_reg_value(regs, bit);
> > > ��������������� perf_output_put(handle, val);
> > > ������� }

wtf happened here?!

> > Yes, but Mark's point is that checking 'compat_user_mode(regs)' should be
> > exactly the same as checking 'perf_reg_abi(current) == PERF_SAMPLE_REGS_ABI_32'.
> > Are you saying that's not the case? If so, please can you provide an example
> > of when they are different?
> > 
> > Leaving that aside for a second, I also think it's reasonable to question
> > whether this whole interface is busted or not. I looked at it last night but
> > struggled to work out what it's supposed to do. Consider these three
> > scenarios, all under an arm64 kernel:
> > 
> >   1. 64-bit perf + 64-bit application being profiled
> >   2. 64-bit perf + 32-bit application being profiled
> >   3. 32-bit perf + 32-bit application being profiled
> > 
> > It looks like the current code is a bodge to try to handle both (2) and
> > (3) at the same time:
> > 
> >   - In case (3), userspace only asks about registers 0-15
> >   - In case (2), we fudge the higher registers so that 64-bit SP and LR
> >     hold the 32-bit values as a bodge to allow a 64-bit dwarf unwinder
> >     to unwind the stack
> 
> I think the fudging is nonsensical to begin with, as I would have
> expected that PERF_REGS_ABI_32 should be the same layout regardless of
> consumer (and therefore should be identical to the 32-bit arm native
> format). I realise that doesn't change that we might be stuck with it...

Thankfully, I don't think the fudging is affected by the patch here, so I'm
inclined to take it as a fix. In fact, with this change, doesn't the prefix
of the regs returned in (2) look the same as the one you would expect in
32-bit arm?

> > So the idea behind the patch looks fine because case (3) is expecting the PC
> > in register 15 and instead gets 0, but the temptation is to clean this up so
> > that cases (2) and (3) report the same data to userspace (along the lines of
> > Mark's patch), namely only the first 16 registers with the PC moved down. We
> > can only do that if the unwinder is happy, which it might be if it only ever
> > looks up dwarf register numbers based on the unwind tables in the binary.
> > Somebody would need to dig into that.
> 
> Agreed; I will try to figure out what the perf tool does in the three
> cases above. I would be grateful if others could take a look too.
> 
> Another slightly scary thought: what happens for a 32-bit perf with a
> 64-bit application being profiled? I don't see how that'd be forbidden,
> but I also don't see how it'd work.

Although you can obviously feed a 32-bit perf with a 64-bit perf.data,
a 32-bit task cannot spawn a 64-bit child so that rules out a lot of the
runtime weirdness, no?

> > Otherwise, if it generates unconditional references to things like
> > register 30 to grab the link register, then we're stuck with the bodge
> > and need to special-case the PC.
> 
> I agree that in that case we'd have to keep the existing bodge, and we'd
> have to special-case the PC, but I'd prefer to split the logic for case
> 1 into a separate function for cases 2 and 3 so that we can more easily
> avoid getting this more confused.
> 
> Let's figure out what userspace does first...

afaict, perf just slurps all the registers in one go and hands them off
to the unwinder, which will index into them according to the dwarf unwind
tables. I have doubts as to how well that works (for example, unwinder
address corrections due to Thumb would not be applied).

So, I think we should take this patch (which puts the PC where you'd expect
to find it for compat tasks) and then we could consider removing the current
lr/sp fudging as a separate patch, which we could revert if it causes a
problem. However, I'm not sure I want to open that up.

What do you think?

Will



More information about the linux-arm-kernel mailing list