[PATCH] irqchip/gic-v4: Fix vcpus racing for vpe->col_idx in vmapp and vmovp
Tangnianyao
tangnianyao at huawei.com
Thu Jul 4 05:42:23 PDT 2024
On 7/1/2024 19:20, Marc Zyngier wrote:
> On Mon, 01 Jul 2024 08:23:05 +0100,
> Nianyao Tang <tangnianyao at huawei.com> wrote:
>> its_map_vm() may modify vpe->col_idx without holding vpe->vpe_lock.
>> This can leave a vpe resident on one RD after it has been vmovp'ed to a
>> different RD. Alternatively, a vpe may be vmovp'ed to the same RD it is
>> currently mapped to in the vpe table.
>>
>> On a GICv4-enabled system with two ITSs, 32 vcpus are deployed on the cpus
>> of collections 0 and 1. Two PCI devices route VLPIs, one through each ITS.
>> A VPE about to become resident on RD1 can hit the cases below because
>> another vcpu, on another cpu, is issuing a vmapp and modifies this VPE's
>> vpe->col_idx.
>>
>> Unexpected Case 1:
>> RD          0                           1
>>                                         vcpu_load
>>                                         lock vpe_lock
>>                                         vpe->col_idx = 1
>>             its_map_vm
>>             lock vmovp_lock
>>                                         waiting vmovp_lock
>>             vpe->col_idx = 0
>>             (cpu0 is first online cpu)
>>             vmapp vpe on col0
>>             unlock vmovp_lock
>>                                         lock vmovp_lock
>>                                         vmovp vpe to col0
>>                                         unlock vmovp_lock
>>                                         vpe resident here fail to
>>                                           receive VLPI!
>>
>> Unexpected Case 2:
>> RD          0                           1
>>             its_map_vm                  vcpu_load
>>             lock vmovp_lock             lock vpe_lock
>>             vpe->col_idx = 0
>>                                         vpe->col_idx = 1
>>             vmapp vpe on col1           waiting vmovp_lock
>>             unlock vmovp_lock
>>                                         lock vmovp_lock
>>                                         vmovp vpe to col1
>>                                         (target RD == source RD!)
>>                                         unlock vmovp_lock
> Why is this second case a problem? What is the conclusion? This commit
> message doesn't explain how you are solving it.
The second case may not be a problem if the ITS does not check for
"target RD == source RD". My approach was to protect vpe->col_idx with
vpe->vpe_lock.
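
To make Case 1 concrete, here is a minimal userspace model of that
interleaving. It is only a sketch under stated assumptions: struct vpe_model,
its fields and the model_* helpers are hypothetical names, not driver code,
and the steps are replayed sequentially instead of running on real CPUs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state involved in Case 1; not struct its_vpe. */
struct vpe_model {
	int col_idx;		/* target collection as seen by software */
	int its_col;		/* collection the modelled ITS maps the VPE to, -1 if unmapped */
	int resident_rd;	/* RD the vcpu actually runs on */
};

/* First half of vcpu_load on RD `rd`: col_idx is written under vpe_lock only. */
static void model_vcpu_load_pick(struct vpe_model *vpe, int rd)
{
	vpe->col_idx = rd;
	vpe->resident_rd = rd;
}

/* Lazy map as in the unpatched its_map_vm(): rewrites col_idx under vmovp_lock only. */
static void model_lazy_map(struct vpe_model *vpe, int first_online_cpu)
{
	vpe->col_idx = first_online_cpu;
	vpe->its_col = vpe->col_idx;	/* VMAPP */
}

/* Second half of vcpu_load: VMOVP to whatever col_idx says by now. */
static void model_vmovp(struct vpe_model *vpe)
{
	vpe->its_col = vpe->col_idx;	/* VMOVP */
}

/* Replays Case 1; returns true if the ITS target matches the resident RD. */
static bool replay_case1(void)
{
	struct vpe_model vpe = { .col_idx = 0, .its_col = -1, .resident_rd = -1 };

	model_vcpu_load_pick(&vpe, 1);	/* RD1: lock vpe_lock, col_idx = 1 */
	model_lazy_map(&vpe, 0);	/* RD0: lock vmovp_lock, col_idx = 0, vmapp */
	model_vmovp(&vpe);		/* RD1: lock vmovp_lock, vmovp to col0 */

	assert(vpe.its_col != -1);	/* the VPE has been mapped by now */
	return vpe.its_col == vpe.resident_rd;
}
```

In this model replay_case1() reports a mismatch: the ITS targets col0 while
the vcpu sits on RD1, which is the "fail to receive VLPI" outcome.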
>
>>
>>
>> Signed-off-by: Nianyao Tang <tangnianyao at huawei.com>
>> ---
>> drivers/irqchip/irq-gic-v3-its.c | 18 +++++++++++++-----
>> 1 file changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index f99c0a86320b..adda9824e0e7 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -1794,11 +1794,15 @@ static bool gic_requires_eager_mapping(void)
>>  static void its_map_vm(struct its_node *its, struct its_vm *vm)
>>  {
>>  	unsigned long flags;
>> +	bool vm_mapped_on_any_its = false;
>> +	int i;
>>
>>  	if (gic_requires_eager_mapping())
>>  		return;
>>
>> -	raw_spin_lock_irqsave(&vmovp_lock, flags);
>> +	for (i = 0; i < GICv4_ITS_LIST_MAX; i++)
>> +		if (vm->vlpi_count[i] > 0)
>> +			vm_mapped_on_any_its = true;
> What makes you think that dropping the vmovp lock is a good idea? What
> if you have a concurrent unmap?
I was trying to protect both vpe->col_idx and vmovp/vmapp with vpe->vpe_lock.
I hadn't considered the unmap path, sorry.
>
>>
>>  	/*
>>  	 * If the VM wasn't mapped yet, iterate over the vpes and get
>> @@ -1813,15 +1817,19 @@ static void its_map_vm(struct its_node *its, struct its_vm *vm)
>>  			struct its_vpe *vpe = vm->vpes[i];
>>  			struct irq_data *d = irq_get_irq_data(vpe->irq);
>>
>> -			/* Map the VPE to the first possible CPU */
>> -			vpe->col_idx = cpumask_first(cpu_online_mask);
>> +			raw_spin_lock_irqsave(&vpe->vpe_lock, flags);
>> +
>> +			if (!vm_mapped_on_any_its) {
>> +				/* Map the VPE to the first possible CPU */
>> +				vpe->col_idx = cpumask_first(cpu_online_mask);
>> +			}
> If the issue is that the target RD isn't initialised before we issue a
> VMAPP, then why not initialising it right from the start?
Yes, initializing vpe->col_idx right from the start is better, and the lazy
vmapp should not modify vpe->col_idx at all.
>
>>  			its_send_vmapp(its, vpe, true);
>>  			its_send_vinvall(its, vpe);
>>  			irq_data_update_effective_affinity(d, cpumask_of(vpe->col_idx));
>> +
>> +			raw_spin_unlock_irqrestore(&vpe->vpe_lock, flags);
>>  		}
>>  	}
>> -
>> -	raw_spin_unlock_irqrestore(&vmovp_lock, flags);
>>  }
>>
>>  static void its_unmap_vm(struct its_node *its, struct its_vm *vm)
> I don't think this patch makes much sense. It opens an even bigger
> race between map and unmap, and I really don't think we want that. The
> main issue is that you are trying to avoid fixing the root cause of
> the problem...
My mistake, I hadn't considered the race between map and unmap. I was only
trying to avoid the race between map and vmovp.
> The first course of action is to move the vpe->col_idx init and
> affinity update to activation time, which would make all
> implementations (v4 with and without VMOVP list, v4.1) behave the
> same. On its own, this solves the biggest issue.
>
> The other thing would be to ensure that this lazy VMAPP cannot take
> place concurrently with vcpu_load(). We can't take the VPE lock after
> the vmovp_lock, as this violates the lock ordering established in
> its_vpe_set_affinity(), which locks the VPE before doing anything else.
>
> A possibility would be to take *all* vpe locks *before* taking the
> vmovp lock, ensuring that vmapp sees something consistent. But that's
> potentially huge, and likely to cause stalls on a busy VM. Instead, a
> per-VM lock would do the trick and allow each VPE lock to be taken in
> turn.
>
> With that, we can fix the locking correctly, and make sure that
> vcpu_load and vmapp are mutually exclusive.
>
> I've written the small patch series below (compile-tested only).
> Please let me know how it fares for you.
I've tested the patch series below. It works and fixes my issue.
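
For reference, this is how I read the lock ordering described above, written
as a small single-threaded userspace model. It is a sketch, not the series
itself: struct model_lock, vm_lock and the model_* helpers are hypothetical
names, and it assumes the affinity path takes the same per-VM lock before the
VPE lock.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_VPES 4	/* arbitrary size for the model */

/* Hypothetical stand-in for a spinlock: only records whether it is held. */
struct model_lock {
	bool held;
};

struct vpe_model {
	struct model_lock vpe_lock;
	int col_idx;
	int mapped_col;		/* collection used by the last modelled VMAPP/VMOVP */
};

struct vm_model {
	struct model_lock vm_lock;	/* hypothetical per-VM lock */
	struct vpe_model vpes[NR_VPES];
};

static void model_acquire(struct model_lock *l)
{
	assert(!l->held);	/* in this model a held lock would mean a deadlock */
	l->held = true;
}

static void model_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Lazy map: per-VM lock first, then each VPE lock in turn around the VMAPP. */
static void model_map_vm(struct vm_model *vm)
{
	model_acquire(&vm->vm_lock);
	for (int i = 0; i < NR_VPES; i++) {
		struct vpe_model *vpe = &vm->vpes[i];

		model_acquire(&vpe->vpe_lock);
		vpe->mapped_col = vpe->col_idx;	/* VMAPP samples a stable col_idx */
		model_release(&vpe->vpe_lock);
	}
	model_release(&vm->vm_lock);
}

/* Affinity change: same order, so it cannot interleave with model_map_vm(). */
static void model_set_affinity(struct vm_model *vm, int idx, int col)
{
	struct vpe_model *vpe = &vm->vpes[idx];

	model_acquire(&vm->vm_lock);
	model_acquire(&vpe->vpe_lock);
	vpe->col_idx = col;
	vpe->mapped_col = col;		/* VMOVP */
	model_release(&vpe->vpe_lock);
	model_release(&vm->vm_lock);
}

/* Returns true when every VPE's mapping agrees with its col_idx. */
static bool model_consistent(const struct vm_model *vm)
{
	for (int i = 0; i < NR_VPES; i++)
		if (vm->vpes[i].mapped_col != vm->vpes[i].col_idx)
			return false;
	return true;
}
```

Because both paths take the per-VM lock before any VPE lock, only one VPE
lock is ever held at a time and the map path never has to grab them all.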
>
> M.
>
> From a9857d782fc649bc08db953477bed9b8c3421118 Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz at kernel.org>
> Date: Mon, 1 Jul 2024 10:39:19 +0100
> Subject: [PATCH 1/3] irqchip/gic-v4: Always configure affinity on VPE
> activation
>
> We currently have two paths to set the initial affinity of a VPE:
>
> - at activation time on GICv4 without the stupid VMOVP list, and
> on GICv4.1
>
> - at map time for GICv4 with VMOVP list
>
> The latter location may end up modifying the affinity of a VPE
> that is currently running, making the results unpredictable.
>
> Instead, unify the two paths, making sure we only set the
> initial affinity at activation time.
>
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 3c755d5dad6e6..a00c5e8c4ea65 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1809,13 +1809,9 @@ static void its_map_vm(struct its_node *its, struct its_vm *vm)
>
>  		for (i = 0; i < vm->nr_vpes; i++) {
>  			struct its_vpe *vpe = vm->vpes[i];
> -			struct irq_data *d = irq_get_irq_data(vpe->irq);
>
> -			/* Map the VPE to the first possible CPU */
> -			vpe->col_idx = cpumask_first(cpu_online_mask);
>  			its_send_vmapp(its, vpe, true);
>  			its_send_vinvall(its, vpe);
> -			irq_data_update_effective_affinity(d, cpumask_of(vpe->col_idx));
>  		}
>  	}
>
> @@ -4562,6 +4558,10 @@ static int its_vpe_irq_domain_activate(struct irq_domain *domain,
>  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
>  	struct its_node *its;
>
> +	/* Map the VPE to the first possible CPU */
> +	vpe->col_idx = cpumask_first(cpu_online_mask);
> +	irq_data_update_effective_affinity(d, cpumask_of(vpe->col_idx));
> +
>  	/*
>  	 * If we use the list map, we issue VMAPP on demand... Unless
>  	 * we're on a GICv4.1 and we eagerly map the VPE on all ITSs
> @@ -4570,9 +4570,6 @@ static int its_vpe_irq_domain_activate(struct irq_domain *domain,
>  	if (!gic_requires_eager_mapping())
>  		return 0;
>
> -	/* Map the VPE to the first possible CPU */
> -	vpe->col_idx = cpumask_first(cpu_online_mask);
> -
>  	list_for_each_entry(its, &its_nodes, entry) {
>  		if (!is_v4(its))
>  			continue;
> @@ -4581,8 +4578,6 @@ static int its_vpe_irq_domain_activate(struct irq_domain *domain,
>  		its_send_vinvall(its, vpe);
>  	}
>
> -	irq_data_update_effective_affinity(d, cpumask_of(vpe->col_idx));
> -
>  	return 0;
>  }
>
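
As a sanity check of my understanding of patch 1, a small userspace model of
the new behaviour: the initial affinity is chosen once at activation, and the
lazy map only consumes vpe->col_idx. This is a sketch; struct vpe_model and
the model_* helpers are hypothetical and only mirror the field name used in
the patch.

```c
#include <assert.h>

/* Hypothetical model; the helpers below do not exist in the driver. */
struct vpe_model {
	int col_idx;
	int effective_cpu;	/* stands in for the effective affinity */
	int mapped_col;		/* -1 until a modelled VMAPP is issued */
};

/* Activation: the only place that chooses the initial affinity. */
static void model_activate(struct vpe_model *vpe, int first_online_cpu)
{
	vpe->col_idx = first_online_cpu;
	vpe->effective_cpu = vpe->col_idx;
	vpe->mapped_col = -1;
}

/* Affinity change made by a later vcpu_load. */
static void model_move(struct vpe_model *vpe, int cpu)
{
	vpe->col_idx = cpu;
	vpe->effective_cpu = cpu;
}

/* Lazy map after the patch: uses col_idx as it stands, never rewrites it. */
static void model_lazy_map(struct vpe_model *vpe)
{
	assert(vpe->col_idx >= 0);	/* activation must have run first */
	vpe->mapped_col = vpe->col_idx;
}
```

With activate, then a move to cpu 1, then the lazy map, the mapping follows
the vcpu to col1 instead of being reset to the first online cpu.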
[...]