Observing softlockups while running heavy IOs

Tue Sep 6 04:22:37 PDT 2016

On Fri, Sep 2, 2016 at 4:34 AM, Bart Van Assche
<bart.vanassche at sandisk.com> wrote:
> On 09/01/2016 03:31 AM, Sreekanth Reddy wrote:
>>
>> I reduced the ISR workload by one third in order to reduce the time
>> spent per CPU in interrupt context, but even then I am observing
>> softlockups.
>>
>> As I mentioned before, only one and the same CPU in the set of CPUs
>> (enabled in affinity_hint) is busy handling the interrupts from the
>> corresponding IRQx. I have done the experiment below in the driver to
>> limit these softlockups/hardlockups, but I am not sure whether it is
>> reasonable to do this in the driver.
>>
>> Experiment:
>> If CPUx is continuously busy handling the IO completions of the
>> remote CPUs (those enabled in the corresponding IRQ's affinity_hint),
>> and the number handled in the same ISR context reaches 1/4th of the
>> HBA queue depth, then the driver sets a flag called
>> 'change_smp_affinity' for this IRQ. I also created a thread which
>> polls this flag every second for every IRQ enabled by the driver. If
>> this thread sees that the flag is set for any IRQ, it writes the next
>> CPU number from the CPUs enabled in the IRQ's affinity_hint to the
>> IRQ's smp_affinity procfs attribute using the
>> 'call_usermodehelper()' API.
>>
>> This is to make sure that interrupts are not processed by the same
>> single CPU all the time, and to make the other CPUs handle the
>> interrupts if the current CPU is continuously busy handling the
>> other CPUs' IO interrupts.
>>
>> For example, consider a system which has 8 logical CPUs, one MSI-X
>> vector enabled in the driver (called IRQ 120), and an HBA queue depth
>> of 8K. Then the IRQ's procfs attributes will be
>> IRQ# 120, affinity_hint=0xff, smp_affinity=0x00
>>
>> After starting heavy IOs, we will observe that only CPU0 is busy
>> handling the interrupts. The experimental driver will change the
>> smp_affinity to the next CPU number, i.e. 0x01 (using the command
>> 'echo 0x01 > /proc/irq/120/smp_affinity', which the driver issues
>> through the call_usermodehelper() API), if it observes that CPU0 is
>> continuously processing more than 2K IO replies of the other CPUs,
>> i.e. CPU1 to CPU7.
>>
>> Is it OK to do this kind of thing in the driver?
>
>
> Hello Sreekanth,
>
> To me this sounds like something that should be implemented in the I/O
> chipset on the motherboard. If you have a look at the Intel Software
> Developer Manuals then you will see that logical destination mode supports
> round-robin interrupt delivery. However, the Linux kernel selects physical
> destination mode on systems with more than eight logical CPUs (see also
> arch/x86/kernel/apic/apic_flat_64.c).
>
> I'm not sure the maintainers of the interrupt subsystem would welcome code
> that emulates round-robin interrupt delivery. So your best option is
> probably to minimize the amount of work that is done in interrupt context
> and to move as much work as possible out of interrupt context in such a way
> that it can be spread over multiple CPU cores, e.g. by using
> queue_work_on().
>
> Bart.

Bart,

Thanks a lot for the input and the valuable information on this issue.

Today I made one more observation: I do not see any lockups if I use
irqbalance version 1.0.4-6, since this version of irqbalance is able
to shift the load to another CPU when one CPU is heavily loaded.

While running heavy IOs, for the first few seconds these are my
driver's IRQ attributes:
--------------------------------------------------------------------------------------------------------------------
ioc number = 0
number of core processors = 24
msix vector count = 2
number of cores per msix vector = 16

    msix index = 0, irq number =  50, smp_affinity = 000040    affinity_hint = 000fff
    msix index = 1, irq number =  51, smp_affinity = 001000    affinity_hint = fff000

We have set affinity for 2 msix vectors and 24 core processors
----------------------------------------------------------------------------------------------------------------------

After a few seconds, irqbalance observed that CPU12 was heavily loaded
by IRQ 51 and changed the smp_affinity to CPU21:
--------------------------------------------------------------------------------------------------------------------
ioc number = 0
number of core processors = 24
msix vector count = 2
number of cores per msix vector = 16

    msix index = 0, irq number =  50, smp_affinity = 000040    affinity_hint = 000fff
    msix index = 1, irq number =  51, smp_affinity = 200000    affinity_hint = fff000

We have set affinity for 2 msix vectors and 24 core processors
---------------------------------------------------------------------------------------------------------------------
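
As a cross-check on the dumps above: smp_affinity and affinity_hint are
hexadecimal CPU bitmasks in which bit n stands for CPU n, so 001000 is
CPU12 and 200000 is CPU21. A small user-space Python sketch for decoding
them follows (the helper name cpus_in_mask is made up for illustration
and is not part of the driver or of irqbalance):

```python
def cpus_in_mask(mask_hex):
    """Return the CPU numbers whose bits are set in a hex affinity mask."""
    # On systems with more than 32 CPUs the kernel prints the mask as
    # comma-separated 32-bit groups, so drop the commas before parsing.
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


print(cpus_in_mask("000040"))  # IRQ 50 smp_affinity -> [6]
print(cpus_in_mask("001000"))  # IRQ 51 before the move -> [12]
print(cpus_in_mask("200000"))  # IRQ 51 after the move -> [21]

# Both CPU12 and CPU21 lie inside the affinity_hint of IRQ 51 (CPU12..CPU23).
hint_cpus = cpus_in_mask("fff000")
print(12 in hint_cpus and 21 in hint_cpus)  # -> True
```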

Whereas irqbalance version 1.0.9 is not able to shift the load to
the other CPUs enabled in the affinity_hint (even when the subset policy
is enabled), and so I was observing the softlockups/hardlockups.
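
To illustrate what shifting the load within the affinity_hint would
mean: pick another CPU whose bit is set in the hint and write its mask to
the IRQ's smp_affinity attribute under /proc/irq/. Below is a rough
Python sketch of a simple round-robin choice. It is only an assumption
for illustration, not irqbalance's actual placement logic, and
next_cpu_mask is a hypothetical helper:

```python
def next_cpu_mask(current_hex, hint_hex):
    """Return a single-CPU hex mask for the next CPU after the current one
    among the CPUs set in the hint, wrapping around at the top."""
    current = int(current_hex.replace(",", ""), 16)
    hint = int(hint_hex.replace(",", ""), 16)
    if hint == 0:
        raise ValueError("affinity_hint has no CPUs set")
    width = hint.bit_length()
    # Start searching just above the highest CPU currently selected.
    start = current.bit_length()
    for offset in range(width):
        cpu = (start + offset) % width
        if (hint >> cpu) & 1:
            return format(1 << cpu, "x")


# IRQ 51: currently on CPU12, hint covers CPU12..CPU23.
print(next_cpu_mask("001000", "fff000"))  # -> 2000 (CPU13)
# From the last CPU in the hint the choice wraps back to CPU12.
print(next_cpu_mask("800000", "fff000"))  # -> 1000 (CPU12)
```

The returned string is what would be written (as root) to the IRQ's
smp_affinity file; the sketch itself does not touch procfs.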

I have attached the irqbalance logs, with debug enabled, for both versions.

Thanks,
Sreekanth
-------------- next part --------------
A non-text attachment was scrubbed...
Name: irqbalance_1.0.4_logs
Type: application/octet-stream
Size: 397312 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/irqbalance/attachments/20160906/f96fada4/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: irqbalance_1.0.9_logs
Type: application/octet-stream
Size: 491520 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/irqbalance/attachments/20160906/f96fada4/attachment-0003.obj>