[RFC v2 PATCH 1/1] ARM64: KGDB: Add FP/SIMD debug support

Vijay Kilari vijay.kilari at gmail.com
Fri Feb 28 06:08:58 EST 2014


Hi Will,

On Fri, Nov 29, 2013 at 3:27 PM, Vijay Kilari <vijay.kilari at gmail.com> wrote:
> On Fri, Nov 29, 2013 at 2:57 PM, Ard Biesheuvel
> <ard.biesheuvel at linaro.org> wrote:
>> On 29 November 2013 10:20, Vijay Kilari <vijay.kilari at gmail.com> wrote:
>>> On Fri, Nov 29, 2013 at 2:23 PM, Ard Biesheuvel
>>> <ard.biesheuvel at linaro.org> wrote:
>>>> On 18 November 2013 13:06,  <vijay.kilari at gmail.com> wrote:
>>>>> From: Vijaya Kumar K <Vijaya.Kumar at caviumnetworks.com>
>>>>>
>>>>> Add KGDB debug support for FP/SIMD processor.This support
>>>>> only debugging of FP/SIMD in kernel mode.
>>>>> With this patch one can view or set FP/SIMD registers in
>>>>> kernel context
>>>>>
>>>>> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar at caviumnetworks.com>
>>>>> ---
>>>>>  arch/arm64/include/asm/fpsimd.h |    1 +
>>>>>  arch/arm64/kernel/fpsimd.c      |    5 ++
>>>>>  arch/arm64/kernel/kgdb.c        |  105 +++++++++++++++++++++++++--------------
>>>>>  3 files changed, 73 insertions(+), 38 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
>>>>> index 609bc44..5039d6d 100644
>>>>> --- a/arch/arm64/include/asm/fpsimd.h
>>>>> +++ b/arch/arm64/include/asm/fpsimd.h
>>>>> @@ -60,6 +60,7 @@ extern void fpsimd_load_state(struct fpsimd_state *state);
>>>>>  extern void fpsimd_thread_switch(struct task_struct *next);
>>>>>  extern void fpsimd_flush_thread(void);
>>>>>  extern void fpsimd_reload_fpstate(void);
>>>>> +extern int  fpsimd_thread_state(void);
>>>>>
>>>>>  #endif
>>>>>
>>>>> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
>>>>> index 5b13c17..27604c1 100644
>>>>> --- a/arch/arm64/kernel/fpsimd.c
>>>>> +++ b/arch/arm64/kernel/fpsimd.c
>>>>> @@ -124,6 +124,11 @@ void fpsimd_flush_thread(void)
>>>>>         set_thread_flag(TIF_FOREIGN_FPSTATE);
>>>>>  }
>>>>>
>>>>> +int fpsimd_thread_state(void)
>>>>
>>>> Please use a meaningful name for the function. Currently, it returns
>>>> true iff 'current' is a non-kernel thread whose FPSIMD state is /not/
>>>> present in the FPSIMD register file. You should choose a name that
>>>> reflects that purpose.
>>>>
>>>  do you have any reason why only TIF_FOREIGN_FPSTATE flag is
>>> set only for non-kernel threads in kernel_neon_begin()?
>>>
>>
>> The purpose of the TIF_FOREIGN_FPSTATE flag is to convey that the
>> FPSIMD context should be restored before a task enters userland. A
>> kernel thread never enters userland by definition so the flag has no
>> meaning there.
>>
>>>>> +{
>>>>> +       return (current->mm && test_thread_flag(TIF_FOREIGN_FPSTATE)) ? 1 : 0;
>>>>> +}
>>>>> +
>>>>
>>>> As Will pointed out, the expression before ? already evaluates to 1 or
>>>> 0, so appending '? 1 : 0' is redundant.
>>>>
>>>>>  void fpsimd_reload_fpstate(void)
>>>>>  {
>>>>>         preempt_disable();
>>>>> diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
>>>>> index f10f2ba..5e34852 100644
>>>>> --- a/arch/arm64/kernel/kgdb.c
>>>>> +++ b/arch/arm64/kernel/kgdb.c
>>>>> @@ -23,6 +23,14 @@
>>>>>  #include <linux/kdebug.h>
>>>>>  #include <linux/kgdb.h>
>>>>>  #include <asm/traps.h>
>>>>> +#include <asm/fpsimd.h>
>>>>> +
>>>>> +#ifdef CONFIG_KERNEL_MODE_NEON
>>>>> +/*
>>>>> + * Structure to hold FP/SIMD register contents
>>>>> + */
>>>>> +struct fpsimd_state kgdb_fpsimd_regs;
>>>>
>>>> 'static' if it's local to this source file
>>>>
>>>    OK
>>>>> +#endif
>>>>>
>>>>>  struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
>>>>>         { "x0", 8, offsetof(struct pt_regs, regs[0])},
>>>>> @@ -59,40 +67,40 @@ struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
>>>>>         { "sp", 8, offsetof(struct pt_regs, sp)},
>>>>>         { "pc", 8, offsetof(struct pt_regs, pc)},
>>>>>         { "pstate", 4, offsetof(struct pt_regs, pstate)},
>>>>> -       { "v0", 16, -1 },
>>>>> -       { "v1", 16, -1 },
>>>>> -       { "v2", 16, -1 },
>>>>> -       { "v3", 16, -1 },
>>>>> -       { "v4", 16, -1 },
>>>>> -       { "v5", 16, -1 },
>>>>> -       { "v6", 16, -1 },
>>>>> -       { "v7", 16, -1 },
>>>>> -       { "v8", 16, -1 },
>>>>> -       { "v9", 16, -1 },
>>>>> -       { "v10", 16, -1 },
>>>>> -       { "v11", 16, -1 },
>>>>> -       { "v12", 16, -1 },
>>>>> -       { "v13", 16, -1 },
>>>>> -       { "v14", 16, -1 },
>>>>> -       { "v15", 16, -1 },
>>>>> -       { "v16", 16, -1 },
>>>>> -       { "v17", 16, -1 },
>>>>> -       { "v18", 16, -1 },
>>>>> -       { "v19", 16, -1 },
>>>>> -       { "v20", 16, -1 },
>>>>> -       { "v21", 16, -1 },
>>>>> -       { "v22", 16, -1 },
>>>>> -       { "v23", 16, -1 },
>>>>> -       { "v24", 16, -1 },
>>>>> -       { "v25", 16, -1 },
>>>>> -       { "v26", 16, -1 },
>>>>> -       { "v27", 16, -1 },
>>>>> -       { "v28", 16, -1 },
>>>>> -       { "v29", 16, -1 },
>>>>> -       { "v30", 16, -1 },
>>>>> -       { "v31", 16, -1 },
>>>>> -       { "fpsr", 4, -1 },
>>>>> -       { "fpcr", 4, -1 },
>>>>> +       { "v0", 16, offsetof(struct fpsimd_state, vregs[0])},
>>>>> +       { "v1", 16, offsetof(struct fpsimd_state, vregs[1])},
>>>>> +       { "v2", 16, offsetof(struct fpsimd_state, vregs[2])},
>>>>> +       { "v3", 16, offsetof(struct fpsimd_state, vregs[3])},
>>>>> +       { "v4", 16, offsetof(struct fpsimd_state, vregs[4])},
>>>>> +       { "v5", 16, offsetof(struct fpsimd_state, vregs[5])},
>>>>> +       { "v6", 16, offsetof(struct fpsimd_state, vregs[6])},
>>>>> +       { "v7", 16, offsetof(struct fpsimd_state, vregs[7])},
>>>>> +       { "v8", 16, offsetof(struct fpsimd_state, vregs[8])},
>>>>> +       { "v9", 16, offsetof(struct fpsimd_state, vregs[9])},
>>>>> +       { "v10", 16, offsetof(struct fpsimd_state, vregs[10])},
>>>>> +       { "v11", 16, offsetof(struct fpsimd_state, vregs[11])},
>>>>> +       { "v12", 16, offsetof(struct fpsimd_state, vregs[12])},
>>>>> +       { "v13", 16, offsetof(struct fpsimd_state, vregs[13])},
>>>>> +       { "v14", 16, offsetof(struct fpsimd_state, vregs[14])},
>>>>> +       { "v15", 16, offsetof(struct fpsimd_state, vregs[15])},
>>>>> +       { "v16", 16, offsetof(struct fpsimd_state, vregs[16])},
>>>>> +       { "v17", 16, offsetof(struct fpsimd_state, vregs[17])},
>>>>> +       { "v18", 16, offsetof(struct fpsimd_state, vregs[18])},
>>>>> +       { "v19", 16, offsetof(struct fpsimd_state, vregs[19])},
>>>>> +       { "v20", 16, offsetof(struct fpsimd_state, vregs[20])},
>>>>> +       { "v21", 16, offsetof(struct fpsimd_state, vregs[21])},
>>>>> +       { "v22", 16, offsetof(struct fpsimd_state, vregs[22])},
>>>>> +       { "v23", 16, offsetof(struct fpsimd_state, vregs[23])},
>>>>> +       { "v24", 16, offsetof(struct fpsimd_state, vregs[24])},
>>>>> +       { "v25", 16, offsetof(struct fpsimd_state, vregs[25])},
>>>>> +       { "v26", 16, offsetof(struct fpsimd_state, vregs[26])},
>>>>> +       { "v27", 16, offsetof(struct fpsimd_state, vregs[27])},
>>>>> +       { "v28", 16, offsetof(struct fpsimd_state, vregs[28])},
>>>>> +       { "v29", 16, offsetof(struct fpsimd_state, vregs[29])},
>>>>> +       { "v30", 16, offsetof(struct fpsimd_state, vregs[30])},
>>>>> +       { "v31", 16, offsetof(struct fpsimd_state, vregs[31])},
>>>>> +       { "fpsr", 4, offsetof(struct fpsimd_state, fpsr)},
>>>>> +       { "fpcr", 4, offsetof(struct fpsimd_state, fpcr)},
>>>>>  };
>>>>>
>>>>>  char *dbg_get_reg(int regno, void *mem, struct pt_regs *regs)
>>>>> @@ -100,11 +108,22 @@ char *dbg_get_reg(int regno, void *mem, struct pt_regs *regs)
>>>>>         if (regno >= DBG_MAX_REG_NUM || regno < 0)
>>>>>                 return NULL;
>>>>>
>>>>> -       if (dbg_reg_def[regno].offset != -1)
>>>>> +       if (dbg_reg_def[regno].offset != -1 && regno < _GP_REGS)
>>>>>                 memcpy(mem, (void *)regs + dbg_reg_def[regno].offset,
>>>>>                        dbg_reg_def[regno].size);
>>>>> -       else
>>>>> +       else {
>>>>> +#ifdef CONFIG_KERNEL_MODE_NEON
>>>>
>>>> else if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON)) {
>>>>
>>>>> +               if (fpsimd_thread_state()) {
>>>>> +                       fpsimd_save_state(&kgdb_fpsimd_regs);
>>>>> +                       memcpy(mem, (void *)&kgdb_fpsimd_regs +
>>>>> +                                       dbg_reg_def[regno].offset,
>>>>> +                                       dbg_reg_def[regno].size);
>>>>
>>>> Why do this conditionally? Shouldn't kgdb return whatever is currently
>>>> present in the register file, regardless of whether 'current' is a
>>>> kernel thread or has had its FPSIMD state swapped out?
>>>> Also, one could wonder why this is dependent on KERNEL_MODE_NEON being set
>>>>
>>>
>>> KGDB, being kernel debugger it was thought that allow
>>> debugging only if kernel mode is supported.
>>> In fact, I proposed (previous email)  to remove this condition and
>>> allow to reading neon
>>> registers always and allow write only if neon is in kernel mode.
>>>
>>
>> I think KGDB is a special case here and should not considered as a
>> user of kernel mode NEON. So even writing the registers should
>> allowable regardless of this setting imo.
>>

 Can we avoid checking on NEON state and allow debugger to
access NEON registers irrespective of state to keep it simple?
and just compile config is enough?

>>> But completely removing this condition check and allowing to modify
>>> is user choice to ensure neon registers state.
>>>
>>> Will, Please let me know your opinion?
>>>
>>>>> +               } else
>>>>> +                       memset(mem, 0, dbg_reg_def[regno].size);
>>>>> +#else
>>>>>                 memset(mem, 0, dbg_reg_def[regno].size);
>>>>> +#endif
>>>>> +       }
>>>>>         return dbg_reg_def[regno].name;
>>>>>  }
>>>>>
>>>>> @@ -113,9 +132,19 @@ int dbg_set_reg(int regno, void *mem, struct pt_regs *regs)
>>>>>         if (regno >= DBG_MAX_REG_NUM || regno < 0)
>>>>>                 return -EINVAL;
>>>>>
>>>>> -       if (dbg_reg_def[regno].offset != -1)
>>>>> +       if (dbg_reg_def[regno].offset != -1 && regno < _GP_REGS)
>>>>>                 memcpy((void *)regs + dbg_reg_def[regno].offset, mem,
>>>>> -                      dbg_reg_def[regno].size);
>>>>> +                       dbg_reg_def[regno].size);
>>>>> +#ifdef CONFIG_KERNEL_MODE_NEON
>>>>> +       else {
>>>>> +               if (fpsimd_thread_state()) {
>>>>
>>>> Again, why the conditional?
>>>>
>>>>> +                       memcpy((void *)&kgdb_fpsimd_regs +
>>>>> +                                       dbg_reg_def[regno].offset,
>>>>> +                                       mem, dbg_reg_def[regno].size);
>>>>> +                       fpsimd_load_state(&kgdb_fpsimd_regs);
>>>>
>>>> Are you sure it is safe to clobber all the registers here? Shouldn't
>>>> you save the state first, do the memcpy() and then load?
>>>>
>>> yes it is safe. gdb read registers after every command so all registers
>>> are saved first before they are modified
>>>
>>
>> Does that imply that -after every command- the whole NEON state (32*16
>> bytes) is saved to kgdb_fpsimd_regs 32 times, each time to access the
>> contents of a single register?
>>
> Yes, I mentioned this overhead in cover letter. There is no single call to
> architecture specific code to read all the registers at once for gdb
> 'register read'
> command. The kgdb makes one call for every register read.
>
> I can think of saving this registers for every debug exception call,
> but that too is not optimized because gdb tool issues multiple step commands
> for every 'next' command.
> IMO, this little overhead is still ok as it is used only in step debugging
>
>> --
>> Ard.



More information about the linux-arm-kernel mailing list