[PATCH] arm64: avoid race condition issue in dump_backtrace

Mark Rutland mark.rutland at arm.com
Wed Mar 28 03:12:41 PDT 2018


On Wed, Mar 28, 2018 at 05:33:32PM +0800, Ji.Zhang wrote:
> On Mon, 2018-03-26 at 12:39 +0100, Mark Rutland wrote:
> > I think that it would be preferable to try to avoid the infinite loop
> > case. We could hit that by accident if we're tracing a live task.
> > 
> > It's a little tricky to ensure that we don't loop, since we can have
> > traces that span several stacks, e.g. overflow -> irq -> task, so we
> > need to know where the last frame was, and we need to define a strict
> > order for stack nesting.
> Can we approach this in a simpler way? According to the AArch64 PCS,
> the stack is full-descending, so we can validate fp by comparing it
> with the previous fp: if they are equal, there is an exact loop, and
> if the current fp is smaller than the previous one, the unwind has
> gone backwards, which is also unexpected. The only concern is how to
> handle an unwind that crosses from one stack to another (e.g.
> overflow->irq, or irq->task, etc.).
> The diff below is a proposal: we check whether the unwind crosses
> stacks and, if so, use a trick to bypass the fp check.
> 
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index eb2d151..760ea59 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -101,6 +101,7 @@ void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
>  {
>         struct stackframe frame;
>         int skip;
> +       unsigned long fp = 0x0;
> 
>         pr_debug("%s(regs = %p tsk = %p)\n", __func__, regs, tsk);
> 
> @@ -127,6 +128,20 @@ void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
>         skip = !!regs;
>         printk("Call trace:\n");
>         do {
> +               unsigned long stack;
> +               if (fp) {
> +                       if (in_entry_text(frame.pc)) {
> +                               stack = frame.fp - offsetof(struct pt_regs, stackframe);
> +
> +                               if (on_accessible_stack(tsk, stack))
> +                                       fp = frame.fp + 0x8; // trick to bypass the fp check
> +                       }
> +                       if (fp <= frame.fp) {
> +                               pr_notice("fp invalid, stop unwind\n");
> +                               break;
> +                       }
> +               }
> +               fp = frame.fp;

I'm very much not keen on this.

I think that if we're going to do this, the only sane way to do it is to
have unwind_frame() verify the current fp against the previous one, and
verify that we have some strict nesting of stacks. Generally, that means
we can go:

  overflow -> irq -> task

... though I'm not sure what to do about the SDEI stack vs the overflow
stack.
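
The ordering rule described above can be sketched as a standalone check. This is only an illustration, not kernel code: the names `stack_kind`, `frame_pos` and `unwind_step_ok` are hypothetical, the enum values simply encode the overflow -> irq -> task order, and the SDEI stack is left out because its position in the order is still an open question here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stack kinds, numbered in the order an unwind is allowed
 * to visit them: overflow -> irq -> task. SDEI is deliberately omitted.
 */
enum stack_kind {
	STACK_OVERFLOW = 0,
	STACK_IRQ      = 1,
	STACK_TASK     = 2,
};

/* Hypothetical record of where a frame record lives. */
struct frame_pos {
	enum stack_kind stack;	/* stack holding the frame record */
	uintptr_t fp;		/* address of the frame record */
};

/*
 * Return true if stepping from 'prev' to 'next' is a legal unwind step:
 *  - on the same stack, fp must strictly increase (full-descending
 *    stack, so callers sit at higher addresses);
 *  - across stacks, we may only move later in the nesting order.
 */
static bool unwind_step_ok(const struct frame_pos *prev,
			   const struct frame_pos *next)
{
	assert(prev && next);

	if (next->stack == prev->stack)
		return next->fp > prev->fp;

	return next->stack > prev->stack;
}
```

Under these assumptions the pair (stack, fp) strictly increases at every accepted step, so a walk that applies this check at each frame cannot revisit a frame and must terminate.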

Thanks,
Mark.


