ARM Ftrace Function Graph Fails With UNWINDER_FRAME_POINTER

Justin Chen justin.chen at broadcom.com
Sat Dec 2 00:49:07 PST 2023



On 12/1/23 10:53 PM, Ard Biesheuvel wrote:
> On Fri, 1 Dec 2023 at 23:59, Justin Chen <justin.chen at broadcom.com> wrote:
>>
>>
>>
>> On 12/1/23 10:07 AM, Steven Rostedt wrote:
>>> On Fri, 1 Dec 2023 09:25:59 -0800
>>> Justin Chen <justin.chen at broadcom.com> wrote:
>>>
>>>>> It appears the sub instruction at 0x6dd0 correctly accounts for the
>>>>> extra 8 bytes, so the frame pointer is valid. So it is our assumption
>>>>> that there are no gaps between the stack frames is invalid.
>>>>
>>>> Thanks for the assistance. The gap between the stack frame depends on
>>>> the function. Most do not have a gap. Some have 8 (as shown above), some
>>>> have 12. A single assumption here is not going to work. I'm having a
>>>> hard time finding out the reasoning for this gap. I tried disabling a
>>>> bunch of gcc flags as well as -O2 and the gap still exists.
>>>
>>> That code was originally added because of some strange things that gcc did
>>> with mcount (for example, it made a copy of the stack frame that it passed
>>> to mcount, where the function graph tracer replaced the copy of the return
>>> stack making the shadow stack go out of sync and crash). This was very hard
>>> to debug and I added this code to detect it if it happened again.
>>>
>>> Well it's been over a decade since that happened (2009).
>>>
>>>     71e308a239c09 ("function-graph: add stack frame test")
>>>
>>> I'm happy assuming that the compiler folks are aware of our tricks with
>>> hijacking return calls and I don't expect it to happen again. We can just
>>> rip out those checks. That is, if it's only causing false positives, I
>>> don't think it's worth keeping around.
>>>
>>> Has it detected any real issues on the Arm platforms?
>>>
>>> -- Steve
>>
>> I am not familiar enough to make a call. But from my limited testing
>> with ARM, I didn't see any issues. If you would like me to, I can submit
>> a patch to remove the check entirely. Or maybe only disable it for ARM?
>>
> 
> Please try the fix I proposed first.

Just tested it. Seems to do the trick. Either solution works for me.

FWIW I also experimented with LLVM, looks like function_graph just 
crashes regardless of the issue being discussed. The disassemble of 
LLVM[1] does something completely different.

Thanks,
Justin

[1]
LLVM dump
c0c6faa0 <sk_getsockopt>:
c0c6faa0: f0 4f 2d e9   push    {r4, r5, r6, r7, r8, r9, r10, r11, lr}
c0c6faa4: 1c b0 8d e2   add     r11, sp, #28
c0c6faa8: ac d0 4d e2   sub     sp, sp, #172
c0c6faac: 00 70 a0 e1   mov     r7, r0
c0c6fab0: c8 0c 04 e3   movw    r0, #19656
c0c6fab4: 80 02 4c e3   movt    r0, #49792
c0c6fab8: 03 50 a0 e1   mov     r5, r3
c0c6fabc: 00 00 90 e5   ldr     r0, [r0]
c0c6fac0: 02 a0 a0 e1   mov     r10, r2
c0c6fac4: 20 00 0b e5   str     r0, [r11, #-32]
c0c6fac8: 00 40 2d e9   stmdb   sp!, {lr}
c0c6facc: 4b 8b d6 eb   bl      0xc0212800 <__gnu_mcount_nc> @ imm = 
#-10867412
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4206 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20231202/fe7cc9bf/attachment.p7s>


More information about the linux-arm-kernel mailing list