[PATCH] KVM: arm/arm64: vgic-its: Add error handling in vgic_its_cache_translation

Keisuke Nishimura keisuke.nishimura at inria.fr
Thu Nov 28 11:04:01 PST 2024


Hello Marc and Oliver,

Thank you for the replies.

On 28/11/2024 18:55, Oliver Upton wrote:
> On Thu, Nov 28, 2024 at 02:45:34PM +0100, Keisuke Nishimura wrote:
>> The xa_store() may fail because there is no guarantee that the cache_key
>> index is already used in its->translation_cache. This fix (1) resolves
>> the kref inconsistency on failure and (2) returns the error code.
> 
> xa_store() doesn't fail if an entry is already present at the specified
> index. It returns the old entry, which is why we have a vgic_put_irq()
> on the "error" path.

Sure. But to my understanding, this could be the first store to 
its->translation_cache with cache_key, and that is why old can be NULL? I am not 
very familiar with this code, so please correct me if I am wrong.

> 
> Genuine error handling definitely is missing here, but that would only
> happen if the xarray library failed to allocate (-ENOMEM) or if the
> xarray itself is broken beyond repair (-EINVAL).
> 
>> Fixes: 8201d1028caa ("KVM: arm64: vgic-its: Maintain a translation cache per ITS")
>> Signed-off-by: Keisuke Nishimura <keisuke.nishimura at inria.fr>
>> ---
>>   arch/arm64/kvm/vgic/vgic-its.c | 16 ++++++++++++----
>>   1 file changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
>> index 198296933e7e..8f423857b7d2 100644
>> --- a/arch/arm64/kvm/vgic/vgic-its.c
>> +++ b/arch/arm64/kvm/vgic/vgic-its.c
>> @@ -555,7 +555,7 @@ static struct vgic_irq *vgic_its_check_cache(struct kvm *kvm, phys_addr_t db,
>>   	return irq;
>>   }
>>   
>> -static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>> +static int vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>>   				       u32 devid, u32 eventid,
>>   				       struct vgic_irq *irq)
> 
> This was deliberately made a void return. The entire translation cache
> is opportunistic and not required for functional correctness. Nothing
> breaks if we fail to insert an entry for, say, a failed memory
> allocation.

If my point above is correct, in the next version, we can ensure no memory error 
here by calling xa_reserve() in advance. Or, we can ignore the error, and just 
make the kref consistent.

> 
> It would be extremely helpful if you could share the steps to reproduce
> the error you observe.
> 

I encountered this when I studied the usage of xa_*. After some investigation to 
confirm that the call of xa_store() might be the first store with cache_key, I 
sent this patch. (I apologize if I missed anything.)

Thank you,
Keisuke



More information about the linux-arm-kernel mailing list