[PATCH v2 11/15] KVM: arm64: handle pending bit for LPIs in ITS emulation
Eric Auger
eric.auger at linaro.org
Mon Aug 31 02:45:24 PDT 2015
On 08/25/2015 04:34 PM, Andre Przywara wrote:
> Hi Eric,
>
> On 14/08/15 12:58, Eric Auger wrote:
>> On 07/10/2015 04:21 PM, Andre Przywara wrote:
>>> As the actual LPI number in a guest can be quite high, but is mostly
>>> assigned using a very sparse allocation scheme, bitmaps and arrays
>>> for storing the virtual interrupt status are a waste of memory.
>>> We use our equivalent of the "Interrupt Translation Table Entry"
>>> (ITTE) to hold this extra status information for a virtual LPI.
>>> As the normal VGIC code cannot use its fancy bitmaps to manage
>>> pending interrupts, we provide a hook in the VGIC code to let the
>>> ITS emulation handle the list register queueing itself.
>>> LPIs are located in a separate number range (>=8192), so
>>> distinguishing them is easy. With LPIs being only edge-triggered, we
>>> get away with a less complex IRQ handling.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara at arm.com>
>>> ---
>>> include/kvm/arm_vgic.h | 2 ++
>>> virt/kvm/arm/its-emul.c | 71 ++++++++++++++++++++++++++++++++++++++++++++
>>> virt/kvm/arm/its-emul.h | 3 ++
>>> virt/kvm/arm/vgic-v3-emul.c | 2 ++
>>> virt/kvm/arm/vgic.c | 72 ++++++++++++++++++++++++++++++++++-----------
>>> 5 files changed, 133 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>> index 1648668..2a67a10 100644
>>> --- a/include/kvm/arm_vgic.h
>>> +++ b/include/kvm/arm_vgic.h
>>> @@ -147,6 +147,8 @@ struct vgic_vm_ops {
>>> int (*init_model)(struct kvm *);
>>> void (*destroy_model)(struct kvm *);
>>> int (*map_resources)(struct kvm *, const struct vgic_params *);
>>> + bool (*queue_lpis)(struct kvm_vcpu *);
>>> + void (*unqueue_lpi)(struct kvm_vcpu *, int irq);
>>> };
>>>
>>> struct vgic_io_device {
>>> diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
>>> index 7f217fa..b9c40d7 100644
>>> --- a/virt/kvm/arm/its-emul.c
>>> +++ b/virt/kvm/arm/its-emul.c
>>> @@ -50,8 +50,26 @@ struct its_itte {
>>> struct its_collection *collection;
>>> u32 lpi;
>>> u32 event_id;
>>> + bool enabled;
>>> + unsigned long *pending;
>>> };
>>>
>>> +#define for_each_lpi(dev, itte, kvm) \
>>> + list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
>>> + list_for_each_entry(itte, &(dev)->itt, itte_list)
>>> +
>> You have a checkpatch error here:
>>
>> ERROR: Macros with complex values should be enclosed in parentheses
>> #52: FILE: virt/kvm/arm/its-emul.c:57:
>> +#define for_each_lpi(dev, itte, kvm) \
>> + list_for_each_entry(dev, &(kvm)->arch.vgic.its.device_list, dev_list) \
>> + list_for_each_entry(itte, &(dev)->itt, itte_list)
>
> I know about that one. The problem is that if I add the parentheses it
> breaks the usage below due to the curly brackets. But the definition
> above is just so convenient and I couldn't find another neat solution so
> far. If you are concerned about that I can give it another try,
> otherwise I tend to just ignore checkpatch here.
OK maybe you can add a comment then?
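
To illustrate what such a comment would need to explain, here is a minimal
standalone sketch of a two-level iterator macro. The names for_each_cell and
count_cells are hypothetical and are not part of the patch. The macro expands
to two loop heads, not to an expression, so enclosing it in parentheses would
leave nothing for the brace-delimited body to attach to.

```c
#include <stddef.h>

/*
 * Hypothetical two-level iterator in the style of for_each_lpi.
 * It expands to two nested loop heads, so it cannot be wrapped in
 * parentheses: the body that follows must bind to the inner loop.
 */
#define for_each_cell(r, c, rows, cols) \
	for ((r) = 0; (r) < (rows); (r)++) \
		for ((c) = 0; (c) < (cols); (c)++)

/* Count the cells visited by the nested iterator. */
static size_t count_cells(size_t rows, size_t cols)
{
	size_t r, c, n = 0;

	for_each_cell(r, c, rows, cols) {
		n++;	/* runs once per (r, c) pair */
	}
	return n;
}
```

One caveat such a comment could mention: a break inside the body only leaves
the inner loop, while a return (as in find_itte_by_lpi) leaves both.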
>
>>> +static struct its_itte *find_itte_by_lpi(struct kvm *kvm, int lpi)
>>> +{
>> Can't we have the same LPI present in different interrupt translation
>> tables? I don't know whether that is a sensible setting, but I could not
>> find anything saying it is not possible.
>
> Thanks to Marc I am happy (and relieved!) to point you to 6.1.1 LPI INTIDs:
> "The behavior of the GIC is UNPREDICTABLE if software:
> - Maps multiple EventID/DeviceID combinations to the same physical LPI
> INTID."
>
> So I exercise the freedom of UNPREDICTABLE here ;-)
OK
>
>>> + struct its_device *device;
>>> + struct its_itte *itte;
>>> +
>>> + for_each_lpi(device, itte, kvm) {
>>> + if (itte->lpi == lpi)
>>> + return itte;
>>> + }
>>> + return NULL;
>>> +}
>>> +
>>> #define BASER_BASE_ADDRESS(x) ((x) & 0xfffffffff000ULL)
>>>
>>> /* The distributor lock is held by the VGIC MMIO handler. */
>>> @@ -145,6 +163,59 @@ static bool handle_mmio_gits_idregs(struct kvm_vcpu *vcpu,
>>> return false;
>>> }
>>>
>>> +/*
>>> + * Find all enabled and pending LPIs and queue them into the list
>>> + * registers.
>>> + * The dist lock is held by the caller.
>>> + */
>>> +bool vits_queue_lpis(struct kvm_vcpu *vcpu)
>>> +{
>>> + struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
>>> + struct its_device *device;
>>> + struct its_itte *itte;
>>> + bool ret = true;
>>> +
>>> + if (!vgic_has_its(vcpu->kvm))
>>> + return true;
>>> + if (!its->enabled || !vcpu->kvm->arch.vgic.lpis_enabled)
>>> + return true;
>>> +
>>> + spin_lock(&its->lock);
>>> + for_each_lpi(device, itte, vcpu->kvm) {
>>> + if (!itte->enabled || !test_bit(vcpu->vcpu_id, itte->pending))
>>> + continue;
>>> +
>>> + if (!itte->collection)
>>> + continue;
>>> +
>>> + if (itte->collection->target_addr != vcpu->vcpu_id)
>>> + continue;
>>> +
>>> + __clear_bit(vcpu->vcpu_id, itte->pending);
>>> +
>>> + ret &= vgic_queue_irq(vcpu, 0, itte->lpi);
>> What if vgic_queue_irq() fails because no LR can be found? itte->pending
>> was already cleared, so we forget that LPI. Shouldn't we restore the
>> pending state in the ITT? In vgic_queue_hwirq the state change is only
>> performed if vgic_queue_irq() succeeds.
>
> Of course you are right. I will only clear the bit if the call
> succeeds.
>
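
A minimal standalone sketch of the ordering agreed on above, assuming a
simplified model: struct sketch_itte, sketch_queue_irq() and
sketch_queue_lpis() are hypothetical stand-ins for the ITTE, vgic_queue_irq()
and vits_queue_lpis(), and locking and collection checks are left out. The
point is only that the pending bit is cleared after a successful queue, so an
LPI that finds no free LR stays pending for the next attempt.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ITTE with a per-VCPU pending bitmap. */
struct sketch_itte {
	unsigned int lpi;
	bool enabled;
	unsigned long pending;	/* bit n set: pending on VCPU n */
};

/* Hypothetical stand-in for vgic_queue_irq(): fails once no LR is free. */
static bool sketch_queue_irq(int *free_lrs)
{
	if (*free_lrs == 0)
		return false;
	(*free_lrs)--;
	return true;
}

/* Queue all enabled, pending LPIs; returns false if any did not fit. */
static bool sketch_queue_lpis(struct sketch_itte *ittes, size_t n,
			      int vcpu_id, int *free_lrs)
{
	unsigned long mask = 1UL << vcpu_id;
	bool ret = true;
	size_t i;

	for (i = 0; i < n; i++) {
		struct sketch_itte *itte = &ittes[i];

		if (!itte->enabled || !(itte->pending & mask))
			continue;

		/* Clear the pending bit only if the LPI made it into an LR. */
		if (sketch_queue_irq(free_lrs))
			itte->pending &= ~mask;
		else
			ret = false;
	}
	return ret;
}
```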
>>> + }
>>> +
>>> + spin_unlock(&its->lock);
>>> + return ret;
>>> +}
>>> +
>>> +/* Called with the distributor lock held by the caller. */
>>> +void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int lpi)
>> I was a bit confused by the name of the function, with regard to
>> existing vgic_unqueue_irqs which restores the states in accordance to
>> what we have in LR. Wouldn't it make sense to call it
>> vits_lpi_set_pending(vcpu, lpi) or something that looks more similar to
>> vgic_dist_irq_set_pending setter which I think it mirrors.
>
> Well, vgic_unqueue_irqs() "move[s] pending/active IRQs from LRs to the
> distributor", this is what vits_unqueue_lpi() also does, just for one
> _single_ LPI instead of iterating over all LRs. Originally I planned to
> call vgic_unqueue_irqs on every guest exit, so this would have a more
> obvious match.
> Admittedly the function does not do much "unqueueing", as this is done
> in the caller, so I will think about the renaming part.
>
>>> +{
>>> + struct vgic_its *its = &vcpu->kvm->arch.vgic.its;
>>> + struct its_itte *itte;
>>> +
>>> + spin_lock(&its->lock);
>>> +
>>> + /* Find the right ITTE and put the pending state back in there */
>>> + itte = find_itte_by_lpi(vcpu->kvm, lpi);
>>> + if (itte)
>>> + __set_bit(vcpu->vcpu_id, itte->pending);
>>> +
>>> + spin_unlock(&its->lock);
>>> +}
>>> +
>>> static int vits_handle_command(struct kvm_vcpu *vcpu, u64 *its_cmd)
>>> {
>>> return -ENODEV;
>>> diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
>>> index 472a6d0..cc5d5ff 100644
>>> --- a/virt/kvm/arm/its-emul.h
>>> +++ b/virt/kvm/arm/its-emul.h
>>> @@ -33,4 +33,7 @@ void vgic_enable_lpis(struct kvm_vcpu *vcpu);
>>> int vits_init(struct kvm *kvm);
>>> void vits_destroy(struct kvm *kvm);
>>>
>>> +bool vits_queue_lpis(struct kvm_vcpu *vcpu);
>>> +void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
>>> +
>>> #endif
>>> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
>>> index 49be3c3..4132c26 100644
>>> --- a/virt/kvm/arm/vgic-v3-emul.c
>>> +++ b/virt/kvm/arm/vgic-v3-emul.c
>>> @@ -948,6 +948,8 @@ void vgic_v3_init_emulation(struct kvm *kvm)
>>> dist->vm_ops.init_model = vgic_v3_init_model;
>>> dist->vm_ops.destroy_model = vgic_v3_destroy_model;
>>> dist->vm_ops.map_resources = vgic_v3_map_resources;
>>> + dist->vm_ops.queue_lpis = vits_queue_lpis;
>>> + dist->vm_ops.unqueue_lpi = vits_unqueue_lpi;
>>>
>>> dist->vgic_dist_base = VGIC_ADDR_UNDEF;
>>> dist->vgic_redist_base = VGIC_ADDR_UNDEF;
>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>> index 49ee92b..9dfd094 100644
>>> --- a/virt/kvm/arm/vgic.c
>>> +++ b/virt/kvm/arm/vgic.c
>>> @@ -95,6 +95,20 @@ static bool queue_sgi(struct kvm_vcpu *vcpu, int irq)
>>> return vcpu->kvm->arch.vgic.vm_ops.queue_sgi(vcpu, irq);
>>> }
>>>
>>> +static bool vgic_queue_lpis(struct kvm_vcpu *vcpu)
>>> +{
>>> + if (vcpu->kvm->arch.vgic.vm_ops.queue_lpis)
>>> + return vcpu->kvm->arch.vgic.vm_ops.queue_lpis(vcpu);
>>> + else
>>> + return true;
>>> +}
>>> +
>>> +static void vgic_unqueue_lpi(struct kvm_vcpu *vcpu, int irq)
>>> +{
>>> + if (vcpu->kvm->arch.vgic.vm_ops.unqueue_lpi)
>>> + vcpu->kvm->arch.vgic.vm_ops.unqueue_lpi(vcpu, irq);
>>> +}
>>> +
>>> int kvm_vgic_map_resources(struct kvm *kvm)
>>> {
>>> return kvm->arch.vgic.vm_ops.map_resources(kvm, vgic);
>>> @@ -1135,6 +1149,10 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
>>> for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) {
>>> vlr = vgic_get_lr(vcpu, lr);
>>>
>>> + /* We don't care about LPIs here */
>>> + if (vlr.irq >= 8192)
>>> + continue;
>>> +
>>> if (!vgic_irq_is_enabled(vcpu, vlr.irq)) {
>>> vlr.state = 0;
>>> vgic_set_lr(vcpu, lr, vlr);
>>> @@ -1147,25 +1165,33 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
>>> static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>>> int lr_nr, int sgi_source_id)
>>> {
>>> + struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>> struct vgic_lr vlr;
>>>
>>> vlr.state = 0;
>>> vlr.irq = irq;
>>> vlr.source = sgi_source_id;
>>>
>>> - if (vgic_irq_is_active(vcpu, irq)) {
>>> - vlr.state |= LR_STATE_ACTIVE;
>>> - kvm_debug("Set active, clear distributor: 0x%x\n", vlr.state);
>>> - vgic_irq_clear_active(vcpu, irq);
>>> - vgic_update_state(vcpu->kvm);
>>> - } else if (vgic_dist_irq_is_pending(vcpu, irq)) {
>>> - vlr.state |= LR_STATE_PENDING;
>>> - kvm_debug("Set pending: 0x%x\n", vlr.state);
>>> - }
>>> -
>>> - if (!vgic_irq_is_edge(vcpu, irq))
>>> - vlr.state |= LR_EOI_INT;
>>> + /* We care only about state for SGIs/PPIs/SPIs, not for LPIs */
>>> + if (irq < dist->nr_irqs) {
>>> + if (vgic_irq_is_active(vcpu, irq)) {
>>> + vlr.state |= LR_STATE_ACTIVE;
>>> + kvm_debug("Set active, clear distributor: 0x%x\n",
>>> + vlr.state);
>>> + vgic_irq_clear_active(vcpu, irq);
>>> + vgic_update_state(vcpu->kvm);
>>> + } else if (vgic_dist_irq_is_pending(vcpu, irq)) {
>>> + vlr.state |= LR_STATE_PENDING;
>>> + kvm_debug("Set pending: 0x%x\n", vlr.state);
>>> + }
>>>
>>> + if (!vgic_irq_is_edge(vcpu, irq))
>>> + vlr.state |= LR_EOI_INT;
>>> + } else {
>>> + /* If this is an LPI, it can only be pending */
>>> + if (irq >= 8192)
>>> + vlr.state |= LR_STATE_PENDING;
>>> + }
>>> vgic_set_lr(vcpu, lr_nr, vlr);
>>> vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
>>> }
>>> @@ -1177,7 +1203,6 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>>> */
>>> bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>>> {
>>> - struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>> u64 elrsr = vgic_get_elrsr(vcpu);
>>> unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
>>> int lr;
>>> @@ -1185,7 +1210,6 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
>>> /* Sanitize the input... */
>>> BUG_ON(sgi_source_id & ~7);
>>> BUG_ON(sgi_source_id && irq >= VGIC_NR_SGIS);
>>> - BUG_ON(irq >= dist->nr_irqs);
>> Is it safe to remove that check? What if something attempts to inject an
>> SPI larger than supported? I think you should refine the check but not
>> remove it.
>
> The check is now in vgic_queue_irq_to_lr (see above), where we
> differentiate between LPIs (>=8192) and SPIs (<dist->nr_irqs). Please
> correct me if I am wrong on this, but since we are not setting any state
> bits the vgic_set_lr() and vgic_sync_lr_elrsr() calls should be NOPs in
> this case, right?
Looks OK indeed.
Cheers
Eric
> I may re-add the explicit check here for the sake of clarity, though.
>
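
For reference, a refined explicit check could look roughly like the sketch
below. This is only an illustration under assumptions: SKETCH_LPI_BASE and
sketch_irq_in_range() are hypothetical names, and the value 8192 is taken
from the LPI range used in the patch.

```c
#include <stdbool.h>

/* Hypothetical constant: first LPI INTID, as used in the patch (>= 8192). */
#define SKETCH_LPI_BASE 8192

/*
 * Accept SGIs/PPIs/SPIs below nr_irqs and LPIs at or above the LPI base;
 * reject everything in the gap between the two ranges.
 */
static bool sketch_irq_in_range(int irq, int nr_irqs)
{
	if (irq < 0)
		return false;
	return irq < nr_irqs || irq >= SKETCH_LPI_BASE;
}
```

In vgic_queue_irq() this could then be used as something like
BUG_ON(!sketch_irq_in_range(irq, dist->nr_irqs)), which would keep the
dist pointer that the patch removes.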
>>>
>>> kvm_debug("Queue IRQ%d\n", irq);
>>>
>>> @@ -1265,8 +1289,12 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
>>> overflow = 1;
>>> }
>>>
>>> -
>>> -
>>> + /*
>>> + * LPIs are not mapped in our bitmaps, so we leave the iteration
>>> + * to the ITS emulation code.
>>> + */
>>> + if (!vgic_queue_lpis(vcpu))
>>> + overflow = 1;
>>>
>>> epilog:
>>> if (overflow) {
>>> @@ -1387,6 +1415,16 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
>>> for_each_clear_bit(lr_nr, elrsr_ptr, vgic_cpu->nr_lr) {
>>> vlr = vgic_get_lr(vcpu, lr_nr);
>>>
>>> + /* LPIs are handled separately */
>>> + if (vlr.irq >= 8192) {
>>> + /* We just need to take care about still pending LPIs */
>>> + if (vlr.state & LR_STATE_PENDING) {
>>> + vgic_unqueue_lpi(vcpu, vlr.irq);
>>> + pending = true;
>>> + }
>>> + continue;
>> don't we need to reset the LR & update elrsr?
>
> Mmmh, interesting, I just wonder how it worked before. I will move most
> of the lower part into an else clause and call the LR maintenance code
> in both cases.
>
> Cheers,
> Andre.
>
>>> + }
>>> +
>>> BUG_ON(!(vlr.state & LR_STATE_MASK));
>>> pending = true;
>>>
>>> @@ -1411,7 +1449,7 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
>>> }
>>> vgic_update_state(vcpu->kvm);
>>>
>>> - /* vgic_update_state would not cover only-active IRQs */
>>> + /* vgic_update_state would not cover only-active IRQs or LPIs */
>>> if (pending)
>>> set_bit(vcpu->vcpu_id, dist->irq_pending_on_cpu);
>>> }
>>>
>>