[PATCH] ARM64: errata: Add workaround for HIP10/HIP10C erratum 162200803

Thu Jun 26 23:36:31 PDT 2025

On 2025/6/26 21:27, Marc Zyngier wrote:
> On Thu, 26 Jun 2025 13:41:42 +0100,
> Zhou Wang <wangzhou1 at hisilicon.com> wrote:
>>
>> The GICv4.0 implementation on Hip10 and Hip10C has a SoC bug in vPE
>> scheduling: when multiple vPEs issue vPE schedule/deschedule commands
>> concurrently and repeatedly, some vPE schedule commands may not be
>> scheduled, which causes the command to time out.
>>
>> In the hardware implementation there is one GIC per CPU die, which
>> handles the vPE schedule operations from all CPUs of this die one by one.
>> The bug is that if the number of queued vPE schedule operations is more
>> than a certain value, the last vPE schedule operation will be lost.
>>
>> One possible way to solve this problem is to limit the number of vLPIs, so
>> that the hardware spends less time scanning the virtual pending table when
>> it handles vPE schedule operations, and the number of queued vPE schedule
>> operations never exceeds that value.
>>
>> Given the number of CPUs per die, and assuming 100 vPE schedule
>> operations per second per CPU, it can be calculated that limiting the
>> number of vLPIs to 4096 per virtual machine avoids the issue.
>>
>> Signed-off-by: Zhou Wang <wangzhou1 at hisilicon.com>
>> ---
>>  Documentation/arch/arm64/silicon-errata.rst |  2 ++
>>  arch/arm64/Kconfig                          | 12 ++++++++++++
>>  arch/arm64/include/asm/cputype.h            |  4 ++++
>>  arch/arm64/kernel/cpu_errata.c              | 15 +++++++++++++++
>>  arch/arm64/kvm/vgic/vgic-mmio-v3.c          |  5 +++++
>>  arch/arm64/tools/cpucaps                    |  1 +
>>  include/linux/irqchip/arm-gic-v3.h          |  1 +
>>  7 files changed, 40 insertions(+)
>>
> 
> [...]
> 
>> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> index ae4c0593d114..495a56e9dc4b 100644
>> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> @@ -81,6 +81,11 @@ static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
>>  		if (vgic_has_its(vcpu->kvm)) {
>>  			value |= (INTERRUPT_ID_BITS_ITS - 1) << 19;
>>  			value |= GICD_TYPER_LPIS;
>> +			/* Limit the number of vlpis to 4096 */
>> +			if (cpus_have_final_cap(ARM64_WORKAROUND_HISI_162200803) &&
>> +			    kvm_vgic_global_state.has_gicv4 &&
>> +			    !kvm_vgic_global_state.has_gicv4_1)
>> +				value |= 11 << GICD_TYPER_NUM_LPIS_SHIFT;
> 
> This really doesn't solve your problem. Yes, the guest *may* honor
> this limit. But KVM doesn't care and will happily allocate 2^16 vLPIs
> if the guest asks -- there is no code enforcing this limit.

Hi Marc,

I am not sure whether there is any other place where the guest can ask
for a vLPI beyond the limit, apart from the MAPTI/MAPI handling below?
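
For reference, a minimal sketch of how the advertised limit decodes. It
assumes the GICv3 architecture encoding, where GICD_TYPER[15:11] (num_LPIs)
is either 0 (defer to IDbits) or means 2^(num_LPIs + 1) LPIs starting at
INTID 8192. The EX_* macros and typer_num_lpis() are hypothetical names for
illustration only, not kernel symbols; the shift value of 11 is taken from
the architecture, since the patch's definition is not quoted here.

```c
#include <stdint.h>

/* Assumed field layout: GICD_TYPER[15:11] holds num_LPIs. */
#define EX_TYPER_NUM_LPIS_SHIFT	11
#define EX_TYPER_NUM_LPIS_MASK	0x1fu

/*
 * Hypothetical helper: number of LPIs advertised by a GICD_TYPER value.
 * Returns 0 when the field is zero, i.e. the limit comes from IDbits.
 */
static uint64_t typer_num_lpis(uint32_t typer)
{
	uint32_t field = (typer >> EX_TYPER_NUM_LPIS_SHIFT) &
			 EX_TYPER_NUM_LPIS_MASK;

	/* 64-bit shift so that field == 31 does not overflow. */
	return field ? (uint64_t)1 << (field + 1) : 0;
}
```

With the value 11 written by the hunk above, this gives 2^12 = 4096 LPIs.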

> And even if we did. What would we do on a MAPTI command that tries to
> map a vLPI outside of the allowed range? Do we need to tell the guest
> it has screwed up?

Thanks for pointing this out. Yes, we are missing the lpi_nr check in
vgic_its_cmd_handle_mapi. In fact, the fix for this erratum introduces the
use of GICD_TYPER.num_LPIs, so we need to get the related logic right as well.
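
A rough sketch of the kind of range check that is missing, written as
standalone C rather than as a patch against vgic-its.c. The helper name
lpi_in_advertised_range() and the nr_lpis parameter are hypothetical; it
assumes LPI INTIDs start at 8192 and that nr_lpis == 0 means no num_LPIs
limit is advertised. How a failure should be reported to the guest is the
open question above and is not addressed here.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_LPI_BASE_INTID	8192u	/* first LPI INTID in GICv3 */

/*
 * Hypothetical check for a MAPTI/MAPI handler: is lpi_nr inside the
 * window [8192, 8192 + nr_lpis)?
 */
static bool lpi_in_advertised_range(uint32_t lpi_nr, uint64_t nr_lpis)
{
	if (lpi_nr < EX_LPI_BASE_INTID)
		return false;		/* not an LPI at all */

	if (nr_lpis == 0)
		return true;		/* only the IDbits limit applies */

	return (uint64_t)lpi_nr - EX_LPI_BASE_INTID < nr_lpis;
}
```

With a 4096-LPI limit, INTIDs 8192..12287 pass and 12288 is rejected.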

I am not sure whether we could add the lpi_nr check for MAPTI/MAPI as part
of this erratum fix, or whether we should add basic support for
GICD_TYPER.num_LPIs before adding this erratum workaround?

Thanks,
Zhou

> 
> Thanks,
> 
> 	M.
>