Report an error on GICv4.1 vcpu de-schedule
Jingyi Wang
wangjingyi11 at huawei.com
Thu Mar 17 23:14:27 PDT 2022
On 3/17/2022 6:17 PM, Marc Zyngier wrote:
> Hi Jingyi,
>
> On Thu, 17 Mar 2022 07:27:45 +0000,
> Jingyi Wang <wangjingyi11 at huawei.com> wrote:
>>
>> Hi Marc,
>>
>> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
>> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
>> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
>> ready in kvm_vgic_flush_hwstate() for better performance.
>>
>> Most time it works, but we have met an error on our hardware recently.
>> In preemptable kernel, the vcpu can be preempted between vcpu_load and
>> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
>> its_clear_vpend_valid() is called
>>
>> val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>> val &= ~GICR_VPENDBASER_Valid;
>> val &= ~clr;
>> val |= set;
>> gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
>>
>>
>> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
>> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
>> fail and report ""ITS virtual pending table not cleaning".
>>
>> We have communicated with Martin from ARM and get the conclusion
>> that we should not change valid bit while the dirty bit not clear——
>> "The dirty bit reports whether the last schedule /de-schedule
>> operation has completed.The restriction on not changing Valid when Dirty
>> is 1, is so that hardware can always complete the last operation for
>> starting the next".
>
> Indeed, the spec is crystal clear about that, and clearing Valid while
> Dirty is set is plain wrong.
>
>>
>> I think maybe we can check dirty bit clear before clearing the valid bit
>> in its_clear_vpend_valid() code. Hope to know your opinion about this
>> issue.
>
> Yes, that's what should happen. I came up with the patch below. Please
> give it a shot and let me know if that helps. If it does, I'll queue
> it as a fix.
>
> Thanks,
>
> M.
>
Thanks, we will test that soon.
>>From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz at kernel.org>
> Date: Thu, 17 Mar 2022 09:49:02 +0000
> Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
> before descheduling
>
> The way KVM drives GICv4.{0,1} is as follows:
> - vcpu_load() makes the VPE resident, instructing the RD to start
> scanning for interrupts
> - just before entering the guest, we check that the RD has finished
> scanning and that we can start running the vcpu
> - on preemption, we deschedule the VPE by making it invalid on
> the RD
>
> However, we are preemptible between the first two steps. If it so
> happens *and* that the RD was still scanning, we nonetheless write
> to the GICR_VPENDBASER register while Dirty is set, and bad things
> happen (we're in UNPRED land).
>
> This affects both the 4.0 and 4.1 implementations.
>
> Make sure Dirty is cleared before performing the deschedule,
> meaning that its_clear_vpend_valid() becomes a sort of full VPE
> residency barrier.
>
> Reported-by: Jingyi Wang <wangjingyi11 at huawei.com>
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit")
> Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
> ---
> drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
> 1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9e93ff2b6375..c9b1df980899 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
> return 0;
> }
>
> -static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
> {
> u32 count = 1000000; /* 1s! */
> bool clean;
> u64 val;
>
> - val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> - val &= ~GICR_VPENDBASER_Valid;
> - val &= ~clr;
> - val |= set;
> - gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> -
> do {
> val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> clean = !(val & GICR_VPENDBASER_Dirty);
> @@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> }
> } while (!clean && count);
>
> - if (unlikely(val & GICR_VPENDBASER_Dirty)) {
> + if (unlikely(!clean))
> pr_err_ratelimited("ITS virtual pending table not cleaning\n");
> +
> + return val;
> +}
> +
> +static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +{
> + u64 val;
> +
> + /* Make sure we wait until the RD is done with the initial scan */
> + val = read_vpend_dirty_clear(vlpi_base);
> + val &= ~GICR_VPENDBASER_Valid;
> + val &= ~clr;
> + val |= set;
> + gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> +
> + val = read_vpend_dirty_clear(vlpi_base);
> + if (unlikely(val & GICR_VPENDBASER_Dirty))
> val |= GICR_VPENDBASER_PendingLast;
> - }
>
> return val;
> }
>
More information about the linux-arm-kernel
mailing list