[bug report] GICv4.1: doorbell interrupts will be lost in a corner case

Kunkun Jiang jiangkunkun at huawei.com
Thu Jan 25 05:26:41 PST 2024


Hi Marc,

On 2024/1/24 20:43, Marc Zyngier wrote:
> On Wed, 24 Jan 2024 08:54:24 +0000,
> Kunkun Jiang <jiangkunkun at huawei.com> wrote:
>> Hi all,
>>
>> In chapter 8.5 ("Doorbells") of the GIC spec, the affinity of
>> the doorbell interrupt is described like this:
>>
>>> Doorbell interrupts target the Redistributor the vPE is
>>> currently mapped to, based on the previous VMAPP or VMOVP
>>> command for the vPE.
>> The doorbell interrupt here should refer to all types of
>> doorbell interrupt, right?
> There is only one type of doorbell.
>
>> When GICv4.1 is enabled, the doorbell interrupt is
>> turned on only when kvm handles a WFI exit. There is a
>> corner case in which the doorbell interrupt is lost:
>> 1. doorbell interrupt enabled
>> 2. the cpu which the vPE is mapped to is manually taken offline
>>    through 'echo 0 > /sys/devices/system/cpu/cpuX/online'
>> 3. According to the description of chapter 8.5 ("Doorbells"),
>>    the doorbell interrupt coming at this time will still
>>    be sent to the offline cpu. Then the interrupt will be
>>    lost.
>>
>> Should we add a cpu offline callback to handle the
>> doorbell interrupt mapped to this cpu?
> That seems gross. The right way to do it is to track the affinity of
> the doorbell (which we already do), and let the core code move the
> interrupt somewhere else in this case (which it should already do).
>
> Have you actually witnessed this issue? Or is that just idle
> conjecture?
When a cpu goes offline, all interrupts are migrated through
irq_migrate_all_off_this_cpu. The doorbell interrupts will
also try to move to another cpu via VMOVP in
its_vpe_set_affinity.

However, according to the current implementation, GICv4.1
allows us to skip VMOVP if moving to a cpu whose RD is
sharing its VPE table with the current one. And I have
verified this:
0. Each cpu die shares a VPE table and contains 32 CPUs
   die0(CPU0-31) die1(CPU32-63)...
1. Enable GICv4.1
2. Create a 1U VM and bind the vcpu to CPU32, doorbell
   affinity to CPU32
3. Bind the vcpu to CPU16, doorbell affinity to CPU16
4. Manually offline CPU16 on the host side.
   echo 0 > /sys/devices/system/cpu/cpu16/online
5. VMOVP will be skipped.
   (The doorbell is migrated to CPU0 by default.)
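
To make the sequence above easier to follow, here is a small
user-space model of it. This is only a sketch, not kernel code:
struct vpe_model, shares_vpe_table(), set_affinity() and
replay_report() are hypothetical names, and the "one VPE table per
32-CPU die" rule is an assumption taken from the test setup above.
It only models the "skip VMOVP when the RDs share a VPE table"
shortcut, nothing else from the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define CPUS_PER_DIE 32   /* assumption: matches the test setup above */
#define NR_CPUS      64   /* assumption: two dies are enough here */

/* Hypothetical model of one vPE; not the kernel's struct its_vpe. */
struct vpe_model {
	int sw_cpu;     /* CPU the driver believes the doorbell targets */
	int hw_cpu;     /* CPU named by the last VMAPP/VMOVP in this model */
	int vmovp_sent; /* number of VMOVP commands issued */
};

/* In this model, RDs on the same die share one VPE table. */
static bool shares_vpe_table(int a, int b)
{
	return a / CPUS_PER_DIE == b / CPUS_PER_DIE;
}

/* Models the shortcut described above: no VMOVP within one die. */
static void set_affinity(struct vpe_model *vpe, int cpu)
{
	int from = vpe->sw_cpu;

	assert(cpu >= 0 && cpu < NR_CPUS);
	if (from == cpu)
		return;

	vpe->sw_cpu = cpu;
	if (shares_vpe_table(from, cpu))
		return;            /* VMOVP skipped, hw_cpu is left as-is */

	vpe->hw_cpu = cpu;         /* VMOVP issued */
	vpe->vmovp_sent++;
}

/* Replays the report: start on CPU32, move to CPU16, offline CPU16. */
static struct vpe_model replay_report(void)
{
	struct vpe_model vpe = { .sw_cpu = 32, .hw_cpu = 32, .vmovp_sent = 0 };

	set_affinity(&vpe, 16);    /* other die: VMOVP is sent */
	set_affinity(&vpe, 0);     /* CPU16 offlined, same die: skipped */
	return vpe;
}
```

In this model replay_report() ends with sw_cpu == 0 while hw_cpu is
still 16 and only one VMOVP has been sent, which is the state in
which a doorbell would still target the offlined CPU16.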

Looking forward to your reply.

Kunkun Jiang


