[PATCH v11 4/5] arm64: Introduce stack trace reliability checks in the unwinder

Madhavan T. Venkataraman madvenka at linux.microsoft.com
Fri Nov 26 09:23:47 PST 2021



On 11/26/21 7:29 AM, Mark Brown wrote:
> On Thu, Nov 25, 2021 at 10:59:27AM -0600, Madhavan T. Venkataraman wrote:
>> On 11/25/21 8:56 AM, Mark Brown wrote:
>>> On Tue, Nov 23, 2021 at 01:37:22PM -0600, madvenka at linux.microsoft.com wrote:
> 
>>> Probably also worth noting that this doesn't select
>>> HAVE_RELIABLE_STACKTRACE which is what any actual users are going to use
>>> to identify if the architecture has the feature.  I would have been
>>> tempted to add arch_stack_walk() as a separate patch but equally having
>>> the user code there (even if it itself can't yet be used...) helps with
>>> reviewing the actual unwinder so I don't mind.
> 
>> I did not select HAVE_RELIABLE_STACKTRACE just in case we think that some
>> more reliability checks need to be added. But if reviewers agree
>> that this patch series contains all the reliability checks we need, I
>> will add a patch to select HAVE_RELIABLE_STACKTRACE to the series.
> 
> I agree that more checks probably need to be added, might be worth
> throwing that patch into the end of the series though to provide a place
> to discuss what exactly we need.  My main thought here was that it's
> worth explicitly highlighting in this change that the Kconfig bit isn't
> glued up here so reviewers notice that's what's happening.
> 

OK. I will add the patch to the next version.

>>>> +static void unwind_check_reliability(struct task_struct *task,
>>>> +				     struct stackframe *frame)
>>>> +{
>>>> +	if (frame->fp == (unsigned long)task_pt_regs(task)->stackframe) {
>>>> +		/* Final frame; no more unwind, no need to check reliability */
>>>> +		return;
>>>> +	}
> 
>>> If the unwinder carries on for some reason (the code for that is
>>> elsewhere and may be updated separately...) then this will start
>>> checking again.  I'm not sure if this is a *problem* as such but the
>>> thing about this being the final frame coupled with not actually
>>> explicitly stopping the unwind here makes me think this should at least
>>> be clearer, the comment begs the question about what happens if
>>> something decides it is not in fact the final frame.
> 
>> I can address this by adding an explicit comment to that effect.
>> For example, define a separate function to check for the final frame:
> 
>> /*
>>  * Check if this is the final frame. Unwind must stop at the final
>>  * frame.
>>  */
>> static inline bool unwind_is_final_frame(struct task_struct *task,
>>                                          struct stackframe *frame)
>> {
>> 	return frame->fp == (unsigned long)task_pt_regs(task)->stackframe;
>> }
> 
>> Then, use this function in unwind_check_reliability() and unwind_continue().
> 
>> Is this acceptable?
> 
> Yes, I think that should address the issue - I'd have to see it in
> context to be sure but it does make it clear that the same check is
> being done which was the main thing.
> 

OK. I will make the above changes in the next version.

Thanks.

Madhavan



More information about the linux-arm-kernel mailing list