[PATCH v5 19/22] KVM: arm64: vgic-its: ITT save and restore
Auger Eric
eric.auger at redhat.com
Wed May 3 14:55:34 PDT 2017
Hi Christoffer,
On 03/05/2017 18:37, Christoffer Dall wrote:
> On Wed, May 03, 2017 at 06:08:58PM +0200, Auger Eric wrote:
>> Hi Christoffer,
>>
>> On 30/04/2017 22:14, Christoffer Dall wrote:
>>> On Fri, Apr 14, 2017 at 12:15:31PM +0200, Eric Auger wrote:
>>>> Introduce routines to save and restore device ITT and their
>>>> interrupt table entries (ITE).
>>>>
>>>> The routines will be called on device table save and
>>>> restore. They will become static in subsequent patches.
>>>
>>> Why this bottom-up approach? Couldn't you start by having the patch
>>> that restores the device table and define the static functions that
>>> return an error there, and then fill them in with subsequent patches
>>> (like this one)?
>> done
>>>
>>> That would have the added benefit of being able to tell how things are
>>> designed to be called.
>>>
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger at redhat.com>
>>>>
>>>> ---
>>>> v4 -> v5:
>>>> - ITE are now sorted by eventid on the flush
>>>> - rename *flush* into *save*
>>>> - use macros for shifts and masks
>>>> - pass ite_esz to vgic_its_save_ite
>>>>
>>>> v3 -> v4:
>>>> - lookup_table and compute_next_eventid_offset become static in this
>>>> patch
>>>> - remove static along with vgic_its_flush/restore_itt to avoid
>>>> compilation warnings
>>>> - next field only computed with a shift (mask removed)
>>>> - handle the case where the last element has not been found
>>>>
>>>> v2 -> v3:
>>>> - add return 0 in vgic_its_restore_ite (was in subsequent patch)
>>>>
>>>> v2: creation
>>>> ---
>>>> virt/kvm/arm/vgic/vgic-its.c | 128 ++++++++++++++++++++++++++++++++++++++++++-
>>>> virt/kvm/arm/vgic/vgic.h | 4 ++
>>>> 2 files changed, 129 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>>>> index 35b2ca1..b02fc3f 100644
>>>> --- a/virt/kvm/arm/vgic/vgic-its.c
>>>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>>>> @@ -23,6 +23,7 @@
>>>> #include <linux/interrupt.h>
>>>> #include <linux/list.h>
>>>> #include <linux/uaccess.h>
>>>> +#include <linux/list_sort.h>
>>>>
>>>> #include <linux/irqchip/arm-gic-v3.h>
>>>>
>>>> @@ -1695,7 +1696,7 @@ u32 compute_next_devid_offset(struct list_head *h, struct its_device *dev)
>>>> return min_t(u32, next_offset, VITS_DTE_MAX_DEVID_OFFSET);
>>>> }
>>>>
>>>> -u32 compute_next_eventid_offset(struct list_head *h, struct its_ite *ite)
>>>> +static u32 compute_next_eventid_offset(struct list_head *h, struct its_ite *ite)
>>>> {
>>>> struct list_head *e = &ite->ite_list;
>>>> struct its_ite *next;
>>>> @@ -1737,8 +1738,8 @@ typedef int (*entry_fn_t)(struct vgic_its *its, u32 id, void *entry,
>>>> *
>>>> * Return: < 0 on error, 1 if last element identified, 0 otherwise
>>>> */
>>>> -int lookup_table(struct vgic_its *its, gpa_t base, int size, int esz,
>>>> - int start_id, entry_fn_t fn, void *opaque)
>>>> +static int lookup_table(struct vgic_its *its, gpa_t base, int size, int esz,
>>>> + int start_id, entry_fn_t fn, void *opaque)
>>>> {
>>>> void *entry = kzalloc(esz, GFP_KERNEL);
>>>> struct kvm *kvm = its->dev->kvm;
>>>> @@ -1773,6 +1774,127 @@ int lookup_table(struct vgic_its *its, gpa_t base, int size, int esz,
>>>> }
>>>>
>>>> /**
>>>> + * vgic_its_save_ite - Save an interrupt translation entry at @gpa
>>>> + */
>>>> +static int vgic_its_save_ite(struct vgic_its *its, struct its_device *dev,
>>>> + struct its_ite *ite, gpa_t gpa, int ite_esz)
>>>> +{
>>>> + struct kvm *kvm = its->dev->kvm;
>>>> + u32 next_offset;
>>>> + u64 val;
>>>> +
>>>> + next_offset = compute_next_eventid_offset(&dev->itt_head, ite);
>>>> + val = ((u64)next_offset << KVM_ITS_ITE_NEXT_SHIFT) |
>>>> + ((u64)ite->lpi << KVM_ITS_ITE_PINTID_SHIFT) |
>>>> + ite->collection->collection_id;
>>>> + val = cpu_to_le64(val);
>>>> + return kvm_write_guest(kvm, gpa, &val, ite_esz);
>>>> +}
>>>> +
>>>> +/**
>>>> + * vgic_its_restore_ite - restore an interrupt translation entry
>>>> + * @event_id: id used for indexing
>>>> + * @ptr: pointer to the ITE entry
>>>> + * @opaque: pointer to the its_device
>>>> + * @next: id offset to the next entry
>>>> + */
>>>> +static int vgic_its_restore_ite(struct vgic_its *its, u32 event_id,
>>>> + void *ptr, void *opaque, u32 *next)
>>>> +{
>>>> + struct its_device *dev = (struct its_device *)opaque;
>>>> + struct its_collection *collection;
>>>> + struct kvm *kvm = its->dev->kvm;
>>>> + u64 val, *p = (u64 *)ptr;
>>>
>>> nit: initializations on separate line (and possibly do that just above
>>> assigning val).
>> done
>>>
>>>> + struct vgic_irq *irq;
>>>> + u32 coll_id, lpi_id;
>>>> + struct its_ite *ite;
>>>> + int ret;
>>>> +
>>>> + val = *p;
>>>> + *next = 1;
>>>> +
>>>> + val = le64_to_cpu(val);
>>>> +
>>>> + coll_id = val & KVM_ITS_ITE_ICID_MASK;
>>>> + lpi_id = (val & KVM_ITS_ITE_PINTID_MASK) >> KVM_ITS_ITE_PINTID_SHIFT;
>>>> +
>>>> + if (!lpi_id)
>>>> + return 0;
>>>
>>> are all non-zero LPI IDs valid? Don't we have a wrapper that tests if
>>> the ID is valid?
>> no, lpi_id must be >= GIC_MIN_LPI=8192; added that check.
>> ABI Doc says lpi_id==0 is interpreted as invalid. Other values <
>> GIC_MIN_LPI cause an -EINVAL error
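To make the three cases above concrete, here is a minimal userspace sketch of the check. The helper name classify_lpi_id and its return convention are hypothetical and not taken from the patch; only GIC_MIN_LPI = 8192 and the "0 means invalid entry, other values below GIC_MIN_LPI are an error" rule come from the discussion.

```c
#include <errno.h>
#include <stdint.h>

/* First INTID of the LPI range, as stated in the reply above. */
#define GIC_MIN_LPI 8192u

/*
 * Hypothetical helper (not part of the patch).
 * Returns  0       if the entry is marked invalid and must be skipped,
 *          1       if lpi_id lies in the LPI range,
 *          -EINVAL if lpi_id falls in the SGI/PPI/SPI range.
 */
static int classify_lpi_id(uint32_t lpi_id)
{
	if (lpi_id == 0)
		return 0;
	if (lpi_id < GIC_MIN_LPI)
		return -EINVAL;
	return 1;
}
```

With this convention a restore routine would return early on 0, propagate the negative value, and only go on to allocate the ITE when it gets 1.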
>>>
>>> (looks like it's possible to add LPIs with the INTID range of SPIs, SGIs
>>> and PPIs here)
>>
>>>
>>>> +
>>>> + *next = val >> KVM_ITS_ITE_NEXT_SHIFT;
>>>
>>> Don't we need to validate this somehow since it will presumably be used
>>> to forward a pointer somehow by the caller?
>> checked against max number of eventids supported by the device
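A rough sketch of what such a bound check could look like, written as standalone C. The helper name ite_next_offset is hypothetical, and the shift value of 48 is an assumption made for illustration (the patch only refers to KVM_ITS_ITE_NEXT_SHIFT without showing its value). nb_eventid_bits mirrors the its_device field used in vgic_its_restore_itt.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed for illustration: the next offset sits in bits [63:48]. */
#define ITE_NEXT_SHIFT 48

/*
 * Hypothetical helper (not part of the patch).
 * Extracts the offset to the next ITE and rejects it if following it
 * would step past the last event ID the device supports.
 * On success stores the offset in *next and returns 0.
 */
static int ite_next_offset(uint64_t val, uint32_t event_id,
			   unsigned int nb_eventid_bits, uint32_t *next)
{
	uint32_t offset = (uint32_t)(val >> ITE_NEXT_SHIFT);
	uint64_t nb_events;

	if (nb_eventid_bits > 32)
		return -EINVAL;

	nb_events = 1ULL << nb_eventid_bits;
	if ((uint64_t)event_id + offset >= nb_events)
		return -EINVAL;

	*next = offset;
	return 0;
}
```

The addition is done in 64 bits so that a large offset cannot wrap around and slip past the comparison.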
>>>
>>>> +
>>>> + collection = find_collection(its, coll_id);
>>>> + if (!collection)
>>>> + return -EINVAL;
>>>> +
>>>> + ret = vgic_its_alloc_ite(dev, &ite, collection,
>>>> + lpi_id, event_id);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + irq = vgic_add_lpi(kvm, lpi_id);
>>>> + if (IS_ERR(irq))
>>>> + return PTR_ERR(irq);
>>>> + ite->irq = irq;
>>>> +
>>>> + /* restore the configuration of the LPI */
>>>> + ret = update_lpi_config(kvm, irq, NULL);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + update_affinity_ite(kvm, ite);
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static int vgic_its_ite_cmp(void *priv, struct list_head *a,
>>>> + struct list_head *b)
>>>> +{
>>>> + struct its_ite *itea = container_of(a, struct its_ite, ite_list);
>>>> + struct its_ite *iteb = container_of(b, struct its_ite, ite_list);
>>>> +
>>>> + if (itea->event_id < iteb->event_id)
>>>> + return -1;
>>>> + else
>>>> + return 1;
>>>> +}
>>>> +
>>>> +int vgic_its_save_itt(struct vgic_its *its, struct its_device *device)
>>>> +{
>>>> + const struct vgic_its_abi *abi = vgic_its_get_abi(its);
>>>> + gpa_t base = device->itt_addr;
>>>> + struct its_ite *ite;
>>>> + int ret, ite_esz = abi->ite_esz;
>>>
>>> nit: initializations on separate line
>> OK
>>>
>>>> +
>>>> + list_sort(NULL, &device->itt_head, vgic_its_ite_cmp);
>>>> +
>>>> + list_for_each_entry(ite, &device->itt_head, ite_list) {
>>>> + gpa_t gpa = base + ite->event_id * ite_esz;
>>>> +
>>>> + ret = vgic_its_save_ite(its, device, ite, gpa, ite_esz);
>>>> + if (ret)
>>>> + return ret;
>>>> + }
>>>> + return 0;
>>>> +}
>>>> +
>>>> +int vgic_its_restore_itt(struct vgic_its *its, struct its_device *dev)
>>>> +{
>>>> + const struct vgic_its_abi *abi = vgic_its_get_abi(its);
>>>> + gpa_t base = dev->itt_addr;
>>>> + int ret, ite_esz = abi->ite_esz;
>>>> + size_t max_size = BIT_ULL(dev->nb_eventid_bits) * ite_esz;
>>>
>>> nit: initializations on separate line
>> OK
>>>
>>>> +
>>>> + ret = lookup_table(its, base, max_size, ite_esz, 0,
>>>> + vgic_its_restore_ite, dev);
>>>
>>> nit: extra white space
>>>
>>>> +
>>>> + if (ret < 0)
>>>> + return ret;
>>>> +
>>>> + /* if the last element has not been found we are in trouble */
>>>> + return ret ? 0 : -EINVAL;
>>>
>>> hmm, these are values potentially created by the guest in guest RAM,
>>> right? So do we really abort migration and return an error to userspace
>>> in this case?
>> So we discussed with Peter/dave we shouldn't abort() in qemu in case of
>> such error. The restore table IOCTL will return an error. Up to qemu to
>> print the error. Destination guest will not be functional though.
>>
>
> ok, I'm just wondering if userspace can make a qualified decision based
> on this error code. EINVAL typically means that userspace provided
> something incorrect, which I suppose in a sense is true, but this should
> be the only case where we return EINVAL here.
> Userspace must be able to
> tell the cases apart where the guest programmed bogus values into memory
> before migration started, in which case we should ignore-and-resume, and
> where QEMU erroneously provides some bogus value, in which case the
> machine state becomes unreliable and the machine must be powered down.
The guest does not feed much besides the few registers the ITS table
restore depends on. If we want more subtle error management at userspace
level, I am afraid all the error codes need to be revisited. My plan was
to be rougher at the beginning and ignore & resume if the ITS table
restore fails.
Thanks
Eric
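For illustration only, the "ignore & resume" plan above could look roughly like this on the userspace side. Everything here is hypothetical: the enum, the function name and the message are made up for the sketch and are not QEMU code. It does not try to tell guest-created garbage apart from a userspace mistake, which is exactly the limitation discussed above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical outcome of handling the restore ioctl's return value. */
enum restore_action {
	RESTORE_OK,
	RESTORE_IGNORE_AND_RESUME,
};

/*
 * Hypothetical policy helper: ret is 0 or a negative errno value.
 * Any failure is reported and then ignored, so the destination guest
 * resumes, possibly with a non-functional ITS.
 */
static enum restore_action handle_its_restore_result(int ret)
{
	if (ret == 0)
		return RESTORE_OK;

	fprintf(stderr, "ITS table restore failed: %s, resuming anyway\n",
		strerror(-ret));
	return RESTORE_IGNORE_AND_RESUME;
}
```

A finer-grained policy would need the kernel to return distinct error codes per failure cause, which is the rework mentioned above.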
>
> Thanks,
> -Christoffer
>