[PATCH v2 2/3] AArch64: KGDB: Add Basic KGDB support

Thu Oct 10 20:24:29 EDT 2013

On Mon, Oct 7, 2013 at 2:51 AM, Will Deacon <will.deacon at arm.com> wrote:
> Hi Andrew,
>
> On Fri, Oct 04, 2013 at 10:51:02PM +0100, Andrew Pinski wrote:
>> On Wed, Oct 2, 2013 at 3:18 AM, Will Deacon <will.deacon at arm.com> wrote:
>> > On Tue, Oct 01, 2013 at 11:53:01AM +0100, Vijay Kilari wrote:
>> >> In fact, the arch_kgdb_breakpoint is called from only one place from
>> >> kernel/debug/debug_core.c and there is no way for the compiler to
>> >> reorder the block
>> >> as it has two write barriers around it:
>> >>         wmb(); /* Sync point before breakpoint */
>> >>         arch_kgdb_breakpoint();
>> >>         wmb(); /* Sync point after breakpoint */
>> >
>> > A write barrier just looks like a memory clobber to the compiler, which
>> > isn't enough to stop this being reordered wrt breakpoints.
>>
>> Yes this will be a full schedule barrier due to the inline-asm being a
>> volatile inline-asm (implicitly because there are no outputs).  What I
>> meant about the compiler could move around code only happens when you
>> don't there are other implicit non schedule barrier instructions, in
>> this case wmb is an volatile inline-asm which is also a scheduler
>> barrier.
>
> volatile != scheduling barrier (example below).

Actually it does mean that but scheduling barrier != code motion
barrier; at least in GCC.

>
>> > In fact, I don't
>> > think is actually solvable with GCC since there is no way to emit a full
>> > scheduling barrier using asm constraints (we'd need an intrinsic I think).
>> >
>> > You could try clobbering every register, but that's filthy, and will likely
>> > cause reload to generate some dreadful spills.
>>
>> Actually this is enough since both inline-asms (the one for wmb and
>> arch_kgdb_breakpoint) are going to be treated as volatile by the
>> compiler so it will not schedule those two inline-asm with respect of
>> each other.
>
> That's true: the asms will not be optimised away and/or moved across each
> other. However, other code *can* move across then, which strikes me as odd
> when we're dealing with breakpoints.
>
> E.g:
>
> 8<---
>
> #define wmb()   __asm__ __volatile__ ("dsb" ::: "memory")
>
> static inline void nop(void)
> {
>         asm ("nop");
> }
>
> int foo(int *bar)
> {
>         int baz, i;
>
>         baz = *bar;
>
>         for (i = 0; i < 10; ++i)
>                 baz++;
>
>         wmb();
>         nop();
>         wmb();
>
>         return baz;
> }
>
> --->8
>
> Assembles to the following with GCC 4.9 for ARM:
>
> 00000000 <foo>:
>    0:   e5900000        ldr     r0, [r0]
>    4:   f57ff04f        dsb     sy
>    8:   e320f000        nop     {0}
>    c:   f57ff04f        dsb     sy
>   10:   e280000a        add     r0, r0, #10
>   14:   e12fff1e        bx      lr

Yes but this is not the scheduler doing the code motion (it is SCEV
Constant prop followed by TER [temporary expression replacement] which
is causing the code motion to happen).

>
> So, whilst loading the parameter hazards against the memory clobber of the
> wmb()s, the loop (optimised to the single add) is now at the end of the
> function. If I wanted a breakpoint after that loop had finished, it's now no
> longer true.
>
>> Note all other targets (ARM, PPC, x86, tile, etc.) all use a static
>> inline function which does exactly the same as what is being added
>> here.
>
> Agreed, I didn't mean to imply this was an issue with this particular patch.
> I'm just curious as to why this has been deemed acceptable by other
> architectures, when it doesn't look especially useful to me.

The main reason why it is acceptable by other architectures is due to
the full definition of the function itself:
void kgdb_breakpoint(void)
{
        atomic_inc(&kgdb_setting_breakpoint);
        wmb(); /* Sync point before breakpoint */
        arch_kgdb_breakpoint();
        wmb(); /* Sync point after breakpoint */
        atomic_dec(&kgdb_setting_breakpoint);
}

All of these function calls are all volatile inline-asm so there is
not going to be any code motion happening.  This is true in all
architectures.  So then you ask if kgdb_breakpoint can be inlined
anywhere that we could get the breakpoint and not have the "correct"
value, the answer is no and I audited each and every call to
kgdb_breakpoint.  If you want just add the attribute noinline to
kgdb_breakpoint and the problem you mentioned about the code motion
happening goes away.

Thanks,
Andrew Pinski