[PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation
Keisuke Nishimura
keisuke.nishimura at inria.fr
Thu Nov 28 11:04:01 PST 2024
Hello Marc and Oliver,
Thank you for the replies.
On 28/11/2024 18:55, Oliver Upton wrote:
> On Thu, Nov 28, 2024 at 02:45:34PM +0100, Keisuke Nishimura wrote:
>> The xa_store() may fail because there is no guarantee that the cache_key
>> index is already used in its->translation_cache. This fix (1) resolves
>> the kref inconsistency on failure and (2) returns the error code.
>
> xa_store() doesn't fail if an entry is already present at the specified
> index. It returns the old entry, which is why we have a vgic_put_irq()
> on the "error" path.
Understood. But as far as I can tell, this call can also be the first store to
its->translation_cache at cache_key, which is why old can be NULL. In that case
xa_store() may need to allocate and can fail. I am not very familiar with this
code, so please correct me if I am wrong.
>
> Genuine error handling definitely is missing here, but that would only
> happen if the xarray library failed to allocate (-ENOMEM) or if the
> xarray itself is broken beyond repair (-EINVAL).
>
>> Fixes: 8201d1028caa ("KVM: arm64: vgic-its: Maintain a translation cache per ITS")
>> Signed-off-by: Keisuke Nishimura <keisuke.nishimura at inria.fr>
>> ---
>> arch/arm64/kvm/vgic/vgic-its.c | 16 ++++++++++++----
>> 1 file changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
>> index 198296933e7e..8f423857b7d2 100644
>> --- a/arch/arm64/kvm/vgic/vgic-its.c
>> +++ b/arch/arm64/kvm/vgic/vgic-its.c
>> @@ -555,7 +555,7 @@ static struct vgic_irq *vgic_its_check_cache(struct kvm *kvm, phys_addr_t db,
>> return irq;
>> }
>>
>> -static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>> +static int vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>> u32 devid, u32 eventid,
>> struct vgic_irq *irq)
>
> This was deliberately made a void return. The entire translation cache
> is opportunistic and not required for functional correctness. Nothing
> breaks if we fail to insert an entry for, say, a failed memory
> allocation.
If my point above is correct, the next version could rule out a memory
allocation failure here by calling xa_reserve() in advance. Alternatively, we
could ignore the error and just keep the kref consistent.
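To make the second option concrete, here is a minimal userspace sketch (not
kernel code, and not the actual patch). Every name in it is hypothetical: the
mock_* helpers only imitate the behaviour discussed above, where a store
returns the old entry, NULL for a first store, or an encoded error. The point
is only that the reference taken for the cache is dropped again when the store
fails, while the function itself stays void.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct vgic_irq: only a reference count. */
struct mock_irq {
	int refcount;
};

/* Hypothetical one-slot stand-in for its->translation_cache. */
struct mock_cache {
	struct mock_irq *slot;
	int fail_next_store;	/* test knob: make the next store fail */
};

/* Encode a negative errno in a pointer, in the spirit of ERR_PTR(). */
static void *mock_err_ptr(int err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer carries an encoded errno rather than an entry. */
static int mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}

static void mock_get_irq(struct mock_irq *irq) { irq->refcount++; }
static void mock_put_irq(struct mock_irq *irq) { irq->refcount--; }

/*
 * Mock store: returns the previous entry (NULL if the slot was empty),
 * or an encoded -ENOMEM when a failure is simulated.
 */
static void *mock_store(struct mock_cache *c, struct mock_irq *irq)
{
	struct mock_irq *old;

	if (c->fail_next_store) {
		c->fail_next_store = 0;
		return mock_err_ptr(-ENOMEM);
	}
	old = c->slot;
	c->slot = irq;
	return old;
}

/* Opportunistic caching: errors are ignored, refcounts stay balanced. */
static void mock_cache_translation(struct mock_cache *c, struct mock_irq *irq)
{
	void *old;

	mock_get_irq(irq);		/* reference owned by the cache */
	old = mock_store(c, irq);
	if (mock_is_err(old)) {
		mock_put_irq(irq);	/* store failed: cache holds nothing new */
		return;
	}
	if (old)
		mock_put_irq(old);	/* replaced entry loses the cache's reference */
}
```

With this shape, a failed store leaves both the cache contents and every
refcount exactly as they were before the call, so the caller does not need to
see the error.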
>
> It would be extremely helpful if you could share the steps to reproduce
> the error you observe.
>
I found this by reading the code while studying how the xa_* API is used. After
checking that this xa_store() call can be the first store at cache_key, I sent
this patch. (I apologize if I missed anything.)
Thank you,
Keisuke