Report an error on GICv4.1 vcpu de-schedule

Jingyi Wang wangjingyi11 at huawei.com
Thu Mar 17 23:14:27 PDT 2022



On 3/17/2022 6:17 PM, Marc Zyngier wrote:
> Hi Jingyi,
> 
> On Thu, 17 Mar 2022 07:27:45 +0000,
> Jingyi Wang <wangjingyi11 at huawei.com> wrote:
>>
>> Hi Marc,
>>
>> The patch "KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
>> bit"(57e3cebd022fbc035dcf190ac789fd2ffc747f5b) remove the polling of
>> GICR_VPENDBASER.Dirty bit in vcpu_load() , while check the VPT parsing
>> ready in kvm_vgic_flush_hwstate() for better performance.
>>
>> Most time it works, but we have met an error on our hardware recently.
>> In preemptable kernel, the vcpu can be preempted between vcpu_load and
>> kvm_vgic_flush_hwstate. As a result, it get de-scheduled and
>> its_clear_vpend_valid() is called
>>
>> 	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>> 	val &= ~GICR_VPENDBASER_Valid;
>> 	val &= ~clr;
>> 	val |= set;
>> 	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
>>
>>
>> The function clears Valid bit meanwhile GICR_VPENDBASER_Dirty
>> maybe still 1, which cause the subsequent GICR_VPENDBASER_Dirty polling
>> fail and report ""ITS virtual pending table not cleaning".
>>
>> We have communicated with Martin from ARM and get the conclusion
>> that we should not change valid bit while the dirty bit not clear——
>> "The dirty bit reports whether the last schedule /de-schedule
>> operation has completed.The restriction on not changing Valid when Dirty
>> is 1, is so that hardware can always complete the last operation for
>> starting the next".
> 
> Indeed, the spec is crystal clear about that, and clearing Valid while
> Dirty is set is plain wrong.
> 
>>
>> I think maybe we can check dirty bit clear before clearing the valid bit
>> in its_clear_vpend_valid() code. Hope to know your opinion about this
>> issue.
> 
> Yes, that's what should happen. I came up with the patch below. Please
> give it a shot and let me know if that helps. If it does, I'll queue
> it as a fix.
> 
> Thanks,
> 
> 	M.
> 

Thanks, we will test that soon.

>>From c23ccc9cfa603e30ac189d43af75f03b60d780bc Mon Sep 17 00:00:00 2001
> From: Marc Zyngier <maz at kernel.org>
> Date: Thu, 17 Mar 2022 09:49:02 +0000
> Subject: [PATCH] irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear
>   before descheduling
> 
> The way KVM drives GICv4.{0,1} is as follows:
> - vcpu_load() makes the VPE resident, instructing the RD to start
>    scanning for interrupts
> - just before entering the guest, we check that the RD has finished
>    scanning and that we can start running the vcpu
> - on preemption, we deschedule the VPE by making it invalid on
>    the RD
> 
> However, we are preemptible between the first two steps. If it so
> happens *and* that the RD was still scanning, we nonetheless write
> to the GICR_VPENDBASER register while Dirty is set, and bad things
> happen (we're in UNPRED land).
> 
> This affects both the 4.0 and 4.1 implementations.
> 
> Make sure Dirty is cleared before performing the deschedule,
> meaning that its_clear_vpend_valid() becomes a sort of full VPE
> residency barrier.
> 
> Reported-by: Jingyi Wang <wangjingyi11 at huawei.com>
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty
> bit")
> Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
> ---
>   drivers/irqchip/irq-gic-v3-its.c | 28 +++++++++++++++++++---------
>   1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 9e93ff2b6375..c9b1df980899 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3011,18 +3011,12 @@ static int __init allocate_lpi_tables(void)
>   	return 0;
>   }
>   
> -static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +static u64 read_vpend_dirty_clear(void __iomem *vlpi_base)
>   {
>   	u32 count = 1000000;	/* 1s! */
>   	bool clean;
>   	u64 val;
>   
> -	val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
> -	val &= ~GICR_VPENDBASER_Valid;
> -	val &= ~clr;
> -	val |= set;
> -	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> -
>   	do {
>   		val = gicr_read_vpendbaser(vlpi_base + GICR_VPENDBASER);
>   		clean = !(val & GICR_VPENDBASER_Dirty);
> @@ -3033,10 +3027,26 @@ static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
>   		}
>   	} while (!clean && count);
>   
> -	if (unlikely(val & GICR_VPENDBASER_Dirty)) {
> +	if (unlikely(!clean))
>   		pr_err_ratelimited("ITS virtual pending table not cleaning\n");
> +
> +	return val;
> +}
> +
> +static u64 its_clear_vpend_valid(void __iomem *vlpi_base, u64 clr, u64 set)
> +{
> +	u64 val;
> +
> +	/* Make sure we wait until the RD is done with the initial scan */
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	val &= ~GICR_VPENDBASER_Valid;
> +	val &= ~clr;
> +	val |= set;
> +	gicr_write_vpendbaser(val, vlpi_base + GICR_VPENDBASER);
> +
> +	val = read_vpend_dirty_clear(vlpi_base);
> +	if (unlikely(val & GICR_VPENDBASER_Dirty))
>   		val |= GICR_VPENDBASER_PendingLast;
> -	}
>   
>   	return val;
>   }
> 



More information about the linux-arm-kernel mailing list