EDAC driver for ARMv8 L1/L2 cache

York Sun york.sun at nxp.com
Thu Feb 8 07:53:18 PST 2018


On 02/08/2018 07:33 AM, James Morse wrote:
> Hi York,
> 
> On 01/02/18 20:56, York Sun wrote:
>> On 01/15/2018 03:52 PM, Borislav Petkov wrote:
>>> On Mon, Jan 15, 2018 at 11:28:14PM +0000, York Sun wrote:
>>>> It is a generic ARM64 thing. I believe only the SError interrupt is available.
>>>
>>> So if it is, then I'd suggest you hammer out a proper design with the
>>> ARM folks.
> 
>> I made some progress and need some help with the coding. The platform I am
>> working on has A53 cores. Each A53 core has an nINTERRIRQ signal.
>> They are all connected to one GIC interrupt.
> 
> Is this a fatal signal for the CPU that should have received it? The signals
> start out as being per-cpu, but configured like this you can only take the
> interrupt on one CPU...
> 
> (is this thing edge or level triggered?)
> 

Level-triggered.

> 
>> I managed to inject errors at a safe address without triggering a system
>> error, and I got the interrupt.
> 
> (okay, sounds like it's a corrected error)


It was a double-bit error, and I got the nINTERRIRQ interrupt.

> 
> 
>> I will need to find out which core has errors by reading a
>> register on each core (and clear the interrupt). How can I do this
>> within the interrupt service routine? I tried to use
>> smp_call_function_single() but it doesn't like IRQs being disabled.
> 
> mm/memory-failure.c:memory_failure_queue() has an example of how you could do
> this. It uses 'schedule_work_on()' to re-run after the IRQ, from there you
> should be able to call something like on_each_cpu() or smp_call_function_many().
> 
> If this thing is level triggered you can't escape the irq-handler until it's
> cleared. If it needs clearing by a remote CPU this is going to be a problem.
> 

Yeah. For now I use smp_call_function_single_async() and it looks OK. I
will send out a patch for review after clearing up some other hardware
questions with ARM support.
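
To make the level-triggered problem discussed above concrete, here is a small user-space model of the flow. It is not kernel code and not the patch: it only illustrates the pattern of masking the shared line in the handler, deferring the per-core check, and unmasking once every core has cleared its own status. All names (NR_CORES, core_err_pending, irq_handler, deferred_work and so on) are hypothetical. In a real driver the deferred step would presumably go through something like schedule_work_on() plus on_each_cpu(), as suggested earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CORES 4   /* hypothetical core count */

/* Hypothetical per-core "error pending" flag. On real hardware this would be
 * a per-core status register that only the owning core can read and clear. */
static bool core_err_pending[NR_CORES];

static bool irq_masked;    /* shared interrupt line masked at the controller */
static bool work_pending;  /* deferred work queued by the handler */

/* Level-triggered wire-OR: asserted while any core still has an error latched. */
static bool line_asserted(void)
{
    for (int cpu = 0; cpu < NR_CORES; cpu++)
        if (core_err_pending[cpu])
            return true;
    return false;
}

/* Helper for the model: latch an error on one core, as error injection would. */
static void inject_error(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CORES);
    core_err_pending[cpu] = true;
}

/* Hard-IRQ context: the handler cannot query remote cores here, so it only
 * masks the line (a level interrupt would otherwise fire again immediately)
 * and defers the real work. */
static void irq_handler(void)
{
    irq_masked = true;
    work_pending = true;
}

/* Runs "on" the given core: read and clear that core's own status. */
static bool check_and_clear_local(int cpu)
{
    bool had_error = core_err_pending[cpu];

    core_err_pending[cpu] = false;
    return had_error;
}

/* Deferred context: visit every core, collect a bitmask of the cores that
 * reported an error, then unmask the line once it has dropped. */
static unsigned int deferred_work(void)
{
    unsigned int faulty = 0;

    for (int cpu = 0; cpu < NR_CORES; cpu++)
        if (check_and_clear_local(cpu))
            faulty |= 1u << cpu;

    work_pending = false;
    if (!line_asserted())
        irq_masked = false;
    return faulty;
}
```

The point of the model is the ordering: the line stays masked from the handler until every core has cleared its own flag, so a remote core holding the error cannot trap the receiving CPU in the handler.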

York


