Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
Pratyush Anand
panand at redhat.com
Sun Jan 18 22:10:08 PST 2015
On Friday 16 January 2015 09:52 PM, Will Deacon wrote:
> On Fri, Jan 16, 2015 at 12:00:09PM +0000, Pratyush Anand wrote:
>> On Thursday 15 January 2015 10:17 PM, Pratyush Anand wrote:
>>> On Tuesday 13 January 2015 11:23 PM, Pratyush Anand wrote:
>>>> I will still try to find some way to capture enable_dbg macro path.
>>>
>>> I instrumented debug tap points at all the locations from which the
>>> enable_dbg macro is called (see attached debug patch). But I do not
>>> see execution reach any of those tap points between el0_dbg and
>>> el1_dbg, and the tap point debug log also confirms that el1_dbg is
>>> raised before el0_dbg has returned.
>>
>> Probably we all missed this; the ARMv8 spec is very clear about it. In
>> section "D2.1 About debug exceptions" it says:
>>
>> Software Breakpoint Instruction exceptions cannot be masked. The PE
>> takes Software Breakpoint Instruction exceptions regardless of both of
>> the following:
>> • The current Exception level.
>> • The current Security state.
>
> Ah, of course, I completely forgot you were using software breakpoints!
>
>> So, receiving el1_dbg while executing el0_dbg seems perfectly normal
>> to me. If you agree, then I am back to the original query which I asked
>> at the beginning of the thread
>> (http://permalink.gmane.org/gmane.linux.ports.arm.kernel/383672),
>> i.e. how can instruction_pointer be wrong when the second el1_dbg is
>> called recursively (as follows)?
>>
>> [1]-> el0_dbg (After executing BRK instruction by user)
>> [2] -> el1_dbg (when uprobe break handler at [1] executes BRK instruction)
>> (At the end of this ELR_EL1 is programmed with fffffdfffc000004)
>> [3] -> el1_dbg (when kprobe break handler at [2] enables single stepping)
>> (Here ELR_EL1 was found to be fffffe0000092470.) So when this el1_dbg
>> was received, the regs->pc value was not the same as what was
>> programmed in ELR_EL1 at the return of [2].
>
> Perhaps you're not removing the BRK instruction properly, and so you try to
> single-step a trapping instruction and end up stepping into the exception?
>
No, that is probably not the scenario. One thing I agree on: even if the
AArch64 spec says that the SW BRK exception cannot be masked, the current
kernel code is not ready to handle re-entrant software debug exceptions.
So, I will keep those parts of the uprobe code non-kprobable, and then
it's not so important to get into this from a code development
perspective.
However, it would be good to understand what went wrong and caused us to
receive an el1_inv. I still fail to pinpoint the reason for the current
issue, and it's not single-stepping a trapping instruction (BRK). Sorry,
but please have another look at the sequence of events:
1. 1st instruction of uprobe_breakpoint_handler is:
ffffffc00059a628: a9bf7bfd stp x29, x30, [sp,#-16]!
which is replaced by BRK64_OPCODE_KPROBES = 0xD4200080 when the kprobe
is inserted.
2. The user instruction at address 0x4005d0 is replaced by
BRK64_OPCODE_UPROBES = 0xD4200100 when the uprobe is inserted.
3. When the application executes the instruction at 0x4005d0, we receive
el0_dbg.
4. In the el0_dbg handler we execute kernel code at address
ffffffc00059a628, so el1_dbg is raised. (I agree that el0_dbg has not
been closed properly here, which the current entry.S code expects, so we
will need to fix that if we reach consensus on supporting re-entrant
software debug exceptions. However, the issue I see seems unrelated,
so...)
5. Now in el1_dbg, we run kprobe_breakpoint_handler, where we write the
saved instruction (i.e. a9bf7bfd stp x29, x30, [sp,#-16]!) to the
kmalloc-allocated address fffffdfffc000004. The kprobe code does
flush_icache_range on this location. regs->pc is set to
fffffdfffc000004, so elr_el1 is programmed with fffffdfffc000004 during
kernel_exit. I have cross-checked the elr_el1 value just before eret is
executed in kernel_exit, and it is correct.
So, here we are trying to single-step an STP instruction and not a BRK
instruction.
6. Here I am expecting a single-step exception, but I receive an el1_inv
with ESR_EL1 = 0x86000007, i.e. EC is "ESR_EL1_EC_IABT_EL1" and IFSC is
"Translation fault, third level". Why?
As soon as enable_dbg is called in el1_inv, we receive the next
single-step exception, with the ELR_EL1 value being the address of the
instruction following enable_dbg in el1_inv.
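
For reference, the ESR_EL1 value in step 6 can be split into its fields
by hand. The following is only a minimal sketch in Python, assuming the
ARMv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits
[24:0], fault status code in ISS bits [5:0]). The helper names
decode_esr and describe_ifsc are made up for illustration and are not
kernel functions.

```python
# Sketch only: decode an ESR_ELx value, assuming the ARMv8-A layout
# EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
# decode_esr and describe_ifsc are hypothetical helper names.

# A few exception classes relevant to this thread (not exhaustive).
EC_NAMES = {
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x32: "Software Step from a lower Exception level",
    0x33: "Software Step taken without a change in Exception level",
    0x3C: "BRK instruction execution in AArch64 state",
}


def decode_esr(esr):
    """Return (ec, il, iss) extracted from a 32-bit ESR value."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    return ec, il, iss


def describe_ifsc(iss):
    """Describe the fault status code of an instruction abort ISS."""
    fsc = iss & 0x3F
    if 0x04 <= fsc <= 0x07:
        # Translation faults are encoded as 0b0001LL, LL = lookup level.
        return "Translation fault, level %d" % (fsc & 0x3)
    return "other fault status 0x%02x" % fsc


ec, il, iss = decode_esr(0x86000007)
print("EC=0x%02x (%s)" % (ec, EC_NAMES.get(ec, "not listed")))
print("IL=%d ISS=0x%07x (%s)" % (il, iss, describe_ifsc(iss)))
```

With 0x86000007 this gives EC 0x21, i.e. an instruction abort taken at
the same exception level, and a level 3 translation fault, which is
consistent with the reading given in step 6.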
Had we received a single-step exception instead of el1_inv, with the
correct elr_el1, kprobe_single_step_handler would have executed properly
and we would have come back to address ffffffc00059a62c (the 2nd
instruction of uprobe_breakpoint_handler) after returning from this
kprobe single-step handler. [Of course, a fix would be needed to come
back to this address correctly, and then also to return to user space.]
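
As a cross-check on the two opcodes quoted in steps 1 and 2, the
immediate of a BRK instruction can be extracted as below. This is a
rough sketch, assuming the A64 encoding of BRK (fixed bits 0xD4200000
with imm16 in bits [20:5]). The names brk_imm16 and brk_opcode are
hypothetical helpers, not kernel symbols.

```python
# Sketch only: encode/decode the A64 BRK instruction, assuming
# BRK #imm16 = 0xD4200000 | (imm16 << 5).
# brk_imm16 and brk_opcode are hypothetical helper names.

BRK_BASE = 0xD4200000  # encoding of BRK #0
BRK_MASK = 0xFFE0001F  # fixed bits [31:21] and [4:0]


def brk_imm16(opcode):
    """Return the 16-bit immediate of a BRK opcode, or raise ValueError."""
    if (opcode & BRK_MASK) != BRK_BASE:
        raise ValueError("0x%08x is not a BRK instruction" % opcode)
    return (opcode >> 5) & 0xFFFF


def brk_opcode(imm16):
    """Build the opcode for BRK #imm16."""
    return BRK_BASE | ((imm16 & 0xFFFF) << 5)


for name, opcode in (("BRK64_OPCODE_KPROBES", 0xD4200080),
                     ("BRK64_OPCODE_UPROBES", 0xD4200100)):
    print("%s = 0x%08X -> BRK #%d" % (name, opcode, brk_imm16(opcode)))
```

Under that assumption the kprobe opcode is BRK #4 and the uprobe opcode
is BRK #8, and the saved instruction a9bf7bfd is rejected by the mask
check, so the word being single-stepped in step 5 is indeed not a BRK.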
~Pratyush