[PATCH] ARM64: errata: Add workaround for HIP10/HIP10C erratum 162200803
Zhou Wang
wangzhou1 at hisilicon.com
Wed Jul 2 02:57:13 PDT 2025
On 2025/7/1 16:14, Marc Zyngier wrote:
> On Fri, 27 Jun 2025 07:36:31 +0100,
> Zhou Wang <wangzhou1 at hisilicon.com> wrote:
>>
>> On 2025/6/26 21:27, Marc Zyngier wrote:
>>> On Thu, 26 Jun 2025 13:41:42 +0100,
>>> Zhou Wang <wangzhou1 at hisilicon.com> wrote:
>>>>
>>>> For GICv4.0 of Hip10 and Hip10C, it has a SoC bug with vPE schedule:
>>>> when multiple vPEs are sending vpe schedule/deschedule commands
>>>> concurrently and repeatedly, some vPE schedule command may not be
>>>> scheduled, and it will cause the command timeout.
>>>>
>>>> The hardware implementation is that there is one GIC hardware in one CPU die,
>>>> which handles all vPE schedule operations one by one in all CPUs of this die.
>>>> The bug is that if the number of queued vPE schedule operations is more
>>>> than a certain value, the last vPE schedule operation will be lost.
>>>>
>>>> One possible way to solve this problem is to limit the number of vLPIs, so
>>>> the hardware could spend less time to scan virtual pending table when it
>>>> handles the vPE schedule operations, so the queued vPE schedule operations
>>>> will never be more than above certain value.
>>>>
>>>> Given the number of CPUs of die, and imagine there is 100 vPE schedule
>>>> operations per second one CPU, it can be calculated that we can limit
>>>> the number of vLPI to 4096 for virtual machine to avoid the issue.
>>>>
>>>> Signed-off-by: Zhou Wang <wangzhou1 at hisilicon.com>
>>>> ---
>>>> Documentation/arch/arm64/silicon-errata.rst | 2 ++
>>>> arch/arm64/Kconfig | 12 ++++++++++++
>>>> arch/arm64/include/asm/cputype.h | 4 ++++
>>>> arch/arm64/kernel/cpu_errata.c | 15 +++++++++++++++
>>>> arch/arm64/kvm/vgic/vgic-mmio-v3.c | 5 +++++
>>>> arch/arm64/tools/cpucaps | 1 +
>>>> include/linux/irqchip/arm-gic-v3.h | 1 +
>>>> 7 files changed, 40 insertions(+)
>>>>
>>>
>>> [...]
>>>
>>>> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>>>> index ae4c0593d114..495a56e9dc4b 100644
>>>> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>>>> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>>>> @@ -81,6 +81,11 @@ static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
>>>> if (vgic_has_its(vcpu->kvm)) {
>>>> value |= (INTERRUPT_ID_BITS_ITS - 1) << 19;
>>>> value |= GICD_TYPER_LPIS;
>>>> + /* Limit the number of vlpis to 4096 */
>>>> + if (cpus_have_final_cap(ARM64_WORKAROUND_HISI_162200803) &&
>>>> + kvm_vgic_global_state.has_gicv4 &&
>>>> + !kvm_vgic_global_state.has_gicv4_1)
>>>> + value |= 11 << GICD_TYPER_NUM_LPIS_SHIFT;
>>>
>>> This really doesn't solve your problem. Yes, the guest *may* honor
>>> this limit. But KVM doesn't care and will happily allocate 2^16 vLPIs
>>> if the guest asks -- there is no code enforcing this limit.
>>
>> Hi Marc,
>>
>> I am not sure if there is any other place guest can ask vLPI over
>> the limitation except for MAPTI/MAPT below?
>>
>>> And even if we did. What would we do on a MAPTI command that tries to
>>> map a vLPI outside of the allowed range? Do we need to tell the guest
>>> it has screwed up?
>>
>> Thanks for pointing this. Yes, we miss the lpi_nr checking in vgic_its_cmd_handle_mapi.
>> In fact, the fix of this errata introduces the usage of GICD.num_LPI,
>> so we need make related logic right as well.
>
> Exactly.
>
>>
>> I am not sure that if we could add related checking for lpi_nr in MAPTI/MAPI
>> as part of this errata fix, or we should add the basic support for
>> GICD.num_LPI before adding this errata?
>
> You definitely need to handle that before allowing such limit to be
> enforced. Which also means allowing the limit to be saved/restored
> from userspace in order to support migration.
Seems that in KVM we do not consider GICD_TYPER in migration. How about
making GICD_TYPER.num_LPIs as a default configuration, when KVM version is
same between source and destination during migration, the logic is still right.
Something like:
diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
index eb1205654ac8..2071b1445b22 100644
--- a/arch/arm64/kvm/vgic/vgic-init.c
+++ b/arch/arm64/kvm/vgic/vgic-init.c
@@ -385,6 +385,7 @@ int vgic_init(struct kvm *kvm)
/* freeze the number of spis */
if (!dist->nr_spis)
dist->nr_spis = VGIC_NR_IRQS_LEGACY - VGIC_NR_PRIVATE_IRQS;
+ dist->nr_lpis = 2 ^ (INTERRUPT_NUM_LPIS + 1);
ret = kvm_vgic_dist_init(kvm, dist->nr_spis);
if (ret)
@@ -433,6 +434,7 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
kfree(dist->spis);
dist->spis = NULL;
dist->nr_spis = 0;
+ dist->nr_lpis = 0;
dist->vgic_dist_base = VGIC_ADDR_UNDEF;
if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 534049c7c94b..c770eadc5188 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -1047,7 +1047,8 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
else
lpi_nr = event_id;
if (lpi_nr < GIC_LPI_OFFSET ||
- lpi_nr >= max_lpis_propbaser(kvm->arch.vgic.propbaser))
+ lpi_nr >= max_lpis_propbaser(kvm->arch.vgic.propbaser) ||
+ lpi_nr >= GIC_LPI_OFFSET + kvm->arch.vgic.nr_lpis)
return E_ITS_MAPTI_PHYSICALID_OOR;
/* If there is an existing mapping, behavior is UNPREDICTABLE. */
diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index ae4c0593d114..224d0d88c823 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -81,6 +81,7 @@ static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
if (vgic_has_its(vcpu->kvm)) {
value |= (INTERRUPT_ID_BITS_ITS - 1) << 19;
value |= GICD_TYPER_LPIS;
+ value |= (ilog2(vgic->nr_lpis) - 1) << 11;
} else {
value |= (INTERRUPT_ID_BITS_SPIS - 1) << 19;
}
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index 4349084cb9a6..e11792dafcdf 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -16,6 +16,7 @@
#define INTERRUPT_ID_BITS_SPIS 10
#define INTERRUPT_ID_BITS_ITS 16
+#define INTERRUPT_NUM_LPIS 14
#define VGIC_LPI_MAX_INTID ((1 << INTERRUPT_ID_BITS_ITS) - 1)
#define VGIC_PRI_BITS 5
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 4a34f7f0a864..b637dc9460d9 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -296,6 +296,7 @@ struct vgic_dist {
* else.
*/
struct its_vm its_vm;
+ int nr_lpis;
};
However,migration between different KVMs will be broken :(
I am not sure that should we consider this case as well?
Thanks,
Zhou
>
> I was really hoping to never have to support this thing (it really is
> terrible), but if we have to introduce and honor it for correctness
> reasons, then it has to be fully supported>
> Thanks,
>
> M.
>
More information about the linux-arm-kernel
mailing list