[RFC PATCH v1 0/4] arm64: Implement stack trace reliability checks

Madhavan T. Venkataraman madvenka at linux.microsoft.com
Mon Apr 5 18:12:08 BST 2021



On 4/5/21 9:56 AM, Madhavan T. Venkataraman wrote:
> 
> 
> On 4/5/21 8:24 AM, Masami Hiramatsu wrote:
>> Hi Madhavan,
>>
>> On Sat, 3 Apr 2021 22:29:12 -0500
>> "Madhavan T. Venkataraman" <madvenka at linux.microsoft.com> wrote:
>>
>>
>>>>> Check for kretprobe
>>>>> ===================
>>>>>
>>>>> For functions with a kretprobe set up, probe code executes on entry
>>>>> to the function and replaces the return address in the stack frame with a
>>>>> kretprobe trampoline. Whenever the function returns, control is
>>>>> transferred to the trampoline. The trampoline eventually returns to the
>>>>> original return address.
>>>>>
>>>>> A stack trace taken while executing in the function (or in functions that
>>>>> get called from the function) will not show the original return address.
>>>>> Similarly, a stack trace taken while executing in the trampoline itself
>>>>> (and functions that get called from the trampoline) will not show the
>>>>> original return address. This means that the caller of the probed function
>>>>> will not show. This makes the stack trace unreliable.
>>>>>
>>>>> Add the kretprobe trampoline to special_functions[].
>>>>>
>>>>> FYI, each task contains a task->kretprobe_instances list that can
>>>>> theoretically be consulted to find the original return address. But I am
>>>>> not entirely sure how to safely traverse that list for stack traces
>>>>> of tasks other than the current one. So, I have taken the easy way out.
>>>>
>>>> For kretprobes, unwinding from the trampoline or kretprobe handler
>>>> shouldn't be a reliability concern for live patching, for similar
>>>> reasons as above.
>>>>
>>>
>>> Please see previous answer.
>>>
>>>> Otherwise, when unwinding from a blocked task which has
>>>> 'kretprobe_trampoline' on the stack, the unwinder needs a way to get the
>>>> original return address.  Masami has been working on an interface to
>>>> make that possible for x86.  I assume something similar could be done
>>>> for arm64.
>>>>
>>>
>>> OK. Until that is available, this case needs to be addressed.
>>
>> Actually, I've done that on arm64 :) See the patch below.
>> (I also have similar code for arm32. What I'm considering is how
>> to unify kretprobe_find_ret_addr() across x86/arm/arm64, since the
>> implementations are very similar.)
>>
>> This is applicable on my x86 series v5
>>
>> https://lore.kernel.org/bpf/161676170650.330141.6214727134265514123.stgit@devnote2/
>>
>> Thank you,
>>
>>
> 
> I took a brief look at your changes. Looks reasonable.
> 
> However, for now, I am going to include the kretprobe_trampoline in the special_functions[]
> array until your changes are merged. At that point, it is just a matter of deleting
> kretprobe_trampoline from the special_functions[] array. That is all.
> 
> I hope that is fine with everyone.
> 
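
For reference, the interim approach can be sketched in user-space C roughly
as follows. The struct layout, the addresses, and the helper names
is_special_function() and trace_is_reliable() are illustrative assumptions,
not the code from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one [start, end) text range per special function. */
struct function_range {
	uintptr_t start;
	uintptr_t end;
};

/* Made-up addresses; in the kernel these would come from symbol lookup. */
static const struct function_range special_functions[] = {
	{ 0x1000, 0x1040 },	/* stand-in for kretprobe_trampoline */
};

/* Return true if pc lies inside any of the special functions. */
static bool is_special_function(uintptr_t pc)
{
	size_t n = sizeof(special_functions) / sizeof(special_functions[0]);

	for (size_t i = 0; i < n; i++) {
		assert(special_functions[i].start < special_functions[i].end);
		if (pc >= special_functions[i].start &&
		    pc < special_functions[i].end)
			return true;
	}
	return false;
}

/*
 * A trace is treated as reliable only if none of its return addresses
 * falls inside a special function.
 */
static bool trace_is_reliable(const uintptr_t *pcs, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (is_special_function(pcs[i]))
			return false;
	}
	return true;
}
```

Once the original return address can be recovered, dropping the trampoline
entry from the array would be the only change needed in a scheme like this.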

Actually, there may still be a problem to solve.

If arch_stack_walk_reliable() is ever called from within kretprobe_trampoline() (for debugging or
other purposes) after the kretprobe instance has been removed from the task's instance list, the
unwinder would not be able to retrieve the original return address.

The stack trace would be unreliable in that case, would it not?
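
To make the concern concrete, here is a minimal user-space sketch of the
window I am worried about. All names (task_sketch, push_instance(),
pop_instance(), resolve_ret_addr(), TRAMPOLINE_ADDR) are hypothetical
stand-ins, and the lookup is simplified to use only the most recent
instance; it is not the kernel's kretprobe_find_ret_addr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up address standing in for kretprobe_trampoline. */
#define TRAMPOLINE_ADDR ((uintptr_t)0x1000)

/* Simplified per-probe record of the original return address. */
struct kretprobe_instance_sketch {
	uintptr_t ret_addr;
	struct kretprobe_instance_sketch *next;
};

/* Simplified task with its list of active kretprobe instances. */
struct task_sketch {
	struct kretprobe_instance_sketch *instances;
};

/* Function entry: remember the original return address. */
static void push_instance(struct task_sketch *t,
			  struct kretprobe_instance_sketch *ri,
			  uintptr_t ret_addr)
{
	assert(t != NULL && ri != NULL);
	ri->ret_addr = ret_addr;
	ri->next = t->instances;
	t->instances = ri;
}

/* Trampoline: the instance is removed before the trampoline returns. */
static void pop_instance(struct task_sketch *t)
{
	if (t->instances)
		t->instances = t->instances->next;
}

/*
 * Resolve the return address seen in a frame. Returns false when the
 * frame points at the trampoline but no instance is left to consult,
 * which is the case where the trace has to be marked unreliable.
 */
static bool resolve_ret_addr(const struct task_sketch *t, uintptr_t pc,
			     uintptr_t *out)
{
	if (pc != TRAMPOLINE_ADDR) {
		*out = pc;
		return true;
	}
	if (!t->instances)
		return false;
	*out = t->instances->ret_addr;
	return true;
}
```

In this model, a lookup made between pop_instance() and the trampoline's
return fails, so the unwinder would still need some way to flag that window.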

Madhavan



