[PATCH v2 2/2] irqchip/gic-v4.1: Fix skipping VMOVP in cpu offline scenario

Kunkun Jiang jiangkunkun at huawei.com
Fri Jan 26 18:59:13 PST 2024


Hi Marc,

On 2024/1/26 18:52, Marc Zyngier wrote:
> On Fri, 26 Jan 2024 10:30:12 +0000,
> Kunkun Jiang <jiangkunkun at huawei.com> wrote:
>> commit dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour
>> of VMOVP") made an optimization: VMOVP can be skipped when moving a
>> VPE to a cpu whose RD shares its VPE table with the current one.
>>
>> In the cpu offline scenario, mask_val is the entire cpu range
>> except for the offline cpu, so the first cpu in it is CPU0. In a
>> corner case, this may result in lost interrupts:
>> 0. Each cpu die shares a VPE table and contains 32 CPUs
>>     die0(CPU0-31) die1(CPU32-63)...
>> 1. VPE resides on CPU32, doorbell affinity to CPU32.
>> 2. Move VPE to CPU16, doorbell affinity to CPU16.
>> 3. Manually offline CPU16 on the host side:
>>     'echo 0 > /sys/devices/system/cpu/cpu16/online'
>> 4. VMOVP will be skipped.
>> 5. Subsequent doorbell interrupts will be lost.
>>
>> So VMOVP cannot be skipped when the affinity CPU is not in mask_val.
>>
>> Fixes: dd3f050a216e ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP")
>> Signed-off-by: Kunkun Jiang <jiangkunkun at huawei.com>
>> ---
>>   drivers/irqchip/irq-gic-v3-its.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 4b1dbb697959..bfb922f16bc6 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -3817,7 +3817,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
>>   				bool force)
>>   {
>>   	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
>> -	int from, cpu = cpumask_first(mask_val);
>> +	int cur, from, cpu = cpumask_first(mask_val);
>>   	unsigned long flags;
>>   
>>   	/*
>> @@ -3839,11 +3839,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
>>   
>>   	vpe->col_idx = cpu;
>>   
>> +	cur = cpumask_first(irq_data_get_effective_affinity_mask(d));
> When is this not equal to 'from'?
It relies on my previous patch. When VMOVP is skipped, 'cur' is not equal to 'from'.
>>   	/*
>>   	 * GICv4.1 allows us to skip VMOVP if moving to a cpu whose RD
>>   	 * is sharing its VPE table with the current one.
>>   	 */
>>   	if (gic_data_rdist_cpu(cpu)->vpe_table_mask &&
>> +	    cpumask_test_cpu(cur, mask_val) &&
>>   	    cpumask_test_cpu(from, gic_data_rdist_cpu(cpu)->vpe_table_mask))
>>   		goto out;
> This looks utterly pointless to me.
Yes, there are some problems here. I'll think about it more carefully.
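
For reference, the skip decision described in the commit message can be
modelled in user space roughly as below. This is only a sketch under
assumptions: the 32-CPUs-per-die layout is the one from the example above,
while NR_CPUS, CPUS_PER_DIE, die_of(), first_online(), vmovp_skipped() and
target_after_offline() are hypothetical stand-ins for illustration, not
kernel code. It models the existing check only, not the proposed change.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed topology from the example: 32 CPUs per die, one VPE table per die. */
#define NR_CPUS       64
#define CPUS_PER_DIE  32

/* Hypothetical helper: index of the die (and so the VPE table) of a CPU. */
static int die_of(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu / CPUS_PER_DIE;
}

/* Stand-in for cpumask_first(): lowest CPU set in the mask, NR_CPUS if none. */
static int first_online(const bool online[NR_CPUS])
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (online[cpu])
			return cpu;
	return NR_CPUS;
}

/* Models the existing check: skip VMOVP if 'from' shares a VPE table with 'cpu'. */
static bool vmovp_skipped(int from, int cpu)
{
	return die_of(from) == die_of(cpu);
}

/* Builds "every CPU except 'victim'" and returns the target picked from it. */
static int target_after_offline(int victim)
{
	bool online[NR_CPUS];

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		online[cpu] = true;
	online[victim] = false;

	return first_online(online);
}
```

With these assumptions, moving from CPU32 to CPU16 crosses dies, so VMOVP is
issued. Offlining CPU16 then picks CPU0 as the target, which sits on the same
die as CPU16, so the check decides to skip VMOVP (steps 3 and 4 above).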

Kunkun Jiang


