[PATCH v4 01/22] KVM: arm/arm64: Add vITS save/restore API documentation

Christoffer Dall cdall at linaro.org
Fri Apr 21 05:12:18 EDT 2017


On Mon, Apr 10, 2017 at 04:26:14PM +0200, Auger Eric wrote:
> Hi Christoffer,
> 
> On 08/04/2017 20:17, Christoffer Dall wrote:
> > Hi Eric,
> > 
> > Most of my comments below are just me being picky about text when
> > defining a user space ABI, so I think this mainly looks good, but just
> > needs a bit of clarify, except the versioning aspect and exporting the
> > pending table vie the redistributor instead, as Marc and Andre have
> > pointed out.
> > 
> > On Mon, Mar 27, 2017 at 11:30:51AM +0200, Eric Auger wrote:
> >> Add description for how to access vITS registers and how to flush/restore
> >> vITS tables into/from memory
> >>
> >> Signed-off-by: Eric Auger <eric.auger at redhat.com>
> >>
> >> ---
> >> v3 -> v4:
> >> - take into account Peter's comments:
> >>   - typos
> >>   - KVM_DEV_ARM_VGIC_GRP_ITS_TABLES kvm_device_attr = 0
> >>   - add a validity bit in DTE
> >>   - document all fields in CTE and ITE
> >>   - document ABI revision
> >> - take into account Andre's comments:
> >>   - document restrictions about GITS_CREADR writing and GITS_IIDR
> >>   - document -EBUSY error if one or more VCPUS are runnning
> >>   - document 64b registers only can be accessed with 64b access
> >> - itt_addr field matches bits [51:8] of the itt_addr
> >>
> >> v1 -> v2:
> >> - DTE and ITE now are 8 bytes
> >> - DTE and ITE now indexed by deviceid/eventid
> >> - use ITE name instead of ITTE
> >> - mentions ITT_addr matches bits [51:8] of the actual address
> >> - mentions LE layout
> >> ---
> >>  Documentation/virtual/kvm/devices/arm-vgic-its.txt | 118 +++++++++++++++++++++
> >>  1 file changed, 118 insertions(+)
> >>
> >> diff --git a/Documentation/virtual/kvm/devices/arm-vgic-its.txt b/Documentation/virtual/kvm/devices/arm-vgic-its.txt
> >> index 6081a5b..0902d20 100644
> >> --- a/Documentation/virtual/kvm/devices/arm-vgic-its.txt
> >> +++ b/Documentation/virtual/kvm/devices/arm-vgic-its.txt
> >> @@ -36,3 +36,121 @@ Groups:
> >>      -ENXIO:  ITS not properly configured as required prior to setting
> >>               this attribute
> >>      -ENOMEM: Memory shortage when allocating ITS internal data
> >> +
> >> +  KVM_DEV_ARM_VGIC_GRP_ITS_REGS
> >> +  Attributes:
> >> +      The attr field of kvm_device_attr encodes the offset of the
> >> +      ITS register, relative to the ITS control frame base address
> >> +      (ITS_base).
> >> +
> >> +      kvm_device_attr.addr points to a __u64 value whatever the width
> >> +      of the addressed register (32/64 bits). 64 bit registers can only
> >> +      be accessed with full length.
> >> +
> >> +      Writes to read-only registers are ignored by the kernel except for:
> >> +      - GITS_READR. It needs to be restored otherwise commands in the queue
> >> +        will be re-executed after CWRITER setting. Writing this register is
> >                                      ^^^^^^^^^^^^^^^
> > 
> > "after restoring CWRITER." ?
> OK
> > 
> >> +        allowed if the ITS is not enabled (GITS_CTLR.enable = 0). Also it
> > 
> > Does that mean that you can only save/restore a disabled ITS or does it
> > mean that initially after creating the ITS it is disabled and userspace
> > should restore the CWRITER before restoring GITS_CTLR (which may enable
> > the ITS) ?
> No I meant the second. As normally the ITS is responsible for updating
> the GITS_READR, if the userspace plays with it while the ITS is enabled
> this can mess everything. So the userspace is supposed to restore the
> GITS_READR *before* restoring the GITS_CTLR which is likely to enable
> the ITS.
> > 
> >> +        needs to be restored after GITS_CBASER since a write to GITS_CBASER
> >> +        resets GITS_CREADR.
> >> +      - GITS_IIDR. Its Revision field encodes the table layout ABI revision.
> > 
> > How is this really going to work?  Your ABI here must be backwards
> > compatible for future revisions, so what is userspace supposed to do,
> > when it reads a newer revision than it was programmed for?
> destination ABI revision must be >= source ABI revision
> By restoring the IIDR value on the destination side, the userspace
> informs destination KVM about the layout of the table. If the
> destination KVM does not support this ABI revision, the restoration of
> the ITS table fails. Is that wrong?
> 

No, that's fine.  I think my confusion was that I didn't appreciate that
the revision thing is for the table layouts only, and not the other
calls here.

> > 
> > I think we need a more clear description of how the revision is going to
> > be used, such that each operation on the ITS here is described as
> > requiring a minimum revision X, and making sure that userspace can
> > safely ignore things that are of a higher revision number, while at the
> > same time userspace can decide not to use newer features with older
> > kernels.
> 
> At the moment I did not plan to implement any way for the userspace to
> force the ITS state save in a specified ABI revision. Today it uses the
> higher revision. Let's say that ABI V2 supports GICv4, save of source
> ITS with be done in v2 layout. A destination only supporting GICv3 (ABI
> v1) won't be able to read that format. Does it make sense?
> 

That makes sense, yes.

> > 
> >> +
> >> +      For other registers, getting or setting a register has the same
> >> +      effect as reading/writing the register on real hardware.
> >> +  Errors:
> >> +    -ENXIO: Offset does not correspond to any supported register
> >> +    -EFAULT: Invalid user pointer for attr->addr
> >> +    -EINVAL: Offset is not 64-bit aligned
> >> +    -EBUSY: one or more VCPUS are running
> >> +
> >> +  KVM_DEV_ARM_VGIC_GRP_ITS_TABLES
> > 
> > Shouldn't this be called KVM_DEV_ARM_VGIC_GRP_FLUSH_ITS_TABLES ?
> At the moment this group is used for flush and restore. But as you have
> the same remark as Andre I guess it is worth having separate commands in
> KVM_DEV_ARM_VGIC_GRP_CTRL then.
> > 
> >> +  Attributes
> >> +       The attr field of kvm_device_attr must be zero.
> >> +
> >> +       request the flush-save/restore of the ITS tables, namely
> > 
> > Nit: Request (upper case R)
> OK
> > 
> > what does flush-save/restore mean as opposed to just flush?
> meant flush/restore
> > 
> >> +       the device table, the collection table, all the ITT tables,
> >> +       the LPI pending tables. On save, the tables are flushed
> > 
> > , and the LPI pending table.
> OK
> > 
> >> +       into guest memory at the location provisioned by the guest
> >> +       in GITS_BASER (device and collection tables), in the MAPD
> >> +       command (ITT_addr), GICR_PENDBASERs (pending tables).
> >> +
> >> +       This means the GIC should be restored before the ITS and all
> > 
> > should or must?  Is this enforced?
> must. yes it is.
> > 
> >> +       ITS registers but the GITS_CTLR must be restored before
> >> +       restoring the ITS tables.
> >> +
> >> +       The GITS_READR and GITS_IIDR read-only registers must also
> >> +       be restored before the table restore. The IIDR revision field
> >> +       encodes the ABI revision of the table layout. If not set by
> >> +       user space, the restoration of the tables will fail.
> > 
> > consider rewording: ", restoring the ITS tables will fail."
> OK
> > 
> >> +
> >> +       Note the LPI configuration table is read-only for the
> > 
> > Note that
> OK
> > 
> >> +       in-kernel ITS and its save/restore goes through the standard
> > 
> > and saving/restoring it is done via the normal process to save/restore
> > guest RAM.
> OK
> > 
> >> +       RAM save/restore.
> >> +
> >> +       The layout of the tables in guest memory defines an ABI.
> >> +       The entries are laid in little endian format as follows;
> > 
> > s/;/:/
> OK
> > 
> > It's a bit weird to say "as follows:" and then proceed with the error
> > descriptions.  I would simply say "as described in the following
> > paragraph."
> OK
> > 
> >> +
> >> +  Errors:
> >> +    -EINVAL: kvm_device_attr not equal to 0, invalid table data
> >> +    -EFAULT: invalid guest ram access
> >> +    -EBUSY: one or more VCPUS are running
> >> +
> >> +    ITS Table ABI REV1:
> >> +    -------------------
> >> +
> >> +    The device table and ITE are respectively indexed by device id and
> > 
> > s/ITE/ITT/
> right
> > 
> > are indexed by the device id and eventid, respectively.
> > 
> >> +    eventid. The collection table however is not indexed by collection id:
> > 
> > ...by collection id, instead all the CTEs are written...
> > 
> >> +    CTE are written at the beginning of the buffer.
> > 
> > in any particular order, or?
> no order or creation

Probably good to point that out.

> > 
> >> +
> >> +    Device Table Entry (DTE) layout: entry size = 8 bytes
> >> +
> >> +    bits:     | 63| 62 ... 49 | 48 ... 5 | 4 ... 0 |
> >> +    values:   | V |   next    | ITT_addr |  Size   |
> >> +
> >> +    where;
> >> +    - V indicates whether the entry is valid,
> >> +    - ITT_addr matches bits [51:8] of the ITT address (256B aligned),
> >> +    - next field is meaningful only if the entry is valid.
> > 
> > is the ITT_addr meaningful if the entry is not valid?
> no, I will add this detail.
> > 
> >> +    It equals to 0 if this entry is the last one; otherwise it corresponds
> >> +    to the minimum between the offset to the next device id and 2^14 -1.
> > 
> > I don't easily understand this last paragraph and the indentation is
> > weird and makes it look like it's not explaining the next field.
> 
> reworded:
> It equals to 0 if this entry is the last one; otherwise it corresponds
> to the the deviceid offset to the next DTE, capped by 2^14 -1.
> > 
> > You're missing a description of the size field.  Size in what unit?
> > Size of what?
> right, added "size matches the MAPD Size semantic."
> > 
> > 
> >> +
> >> +    Collection Table Entry (CTE) layout: entry size = 8 bytes
> >> +
> >> +    bits:     | 63| 62 ..  52  | 51 ... 16 | 15  ...   0 |
> >> +    values:   | V |    RES0    |  RDBase   |    ICID     |
> >> +
> >> +    where:
> >> +    - V indicates whether the entry is valid,
> > 
> > Do we explain anywhere what RES0 means?
> Added RES0: reserved field with Should-Be-Zero-or-Preserved behavior.
> > 
> >> +    - RDBase matches the PE number (GICR_TYPER.Processor_Number),
> > 
> > is 'matches' the right verb to use here?
> is
> > 
> > What exactly is the format of GICR_TYPER.Processor_Number ?
> > 
> >> +    - ICID matches the collection ID
> > 
> > again, is 'matches' the right verb to use here?
> is
> > 
> >> +
> >> +    Interrupt Translation Entry (ITE) layout: entry size = 8 bytes
> >> +
> >> +    bits:     | 63 ... 48 | 47 ... 16 | 15 ... 0 |
> >> +    values:   |    next   |   pINTID  |  ICID    |
> >> +
> >> +    where:
> >> +    - pINTID is the physical LPI ID,
> >> +    - ICID is the collection ID,
> > 
> > here you use 'is' intead of 'matches'.  At leat be consistent.
> is
> > 
> >> +    - next field is meaningful only if the entry is valid (pINTID != 0).
> >> +    It equals to 0 if this entry is the last one; otherwise it corresponds
> >> +    to the minimum between the eventid offset to the next ITE and 2^16 -1.
> > 
> > same comments, as above.
> ok
> > 
> > also, can you list the field in the order they appear?
> ok
> > 
> >> +
> >> +    LPI Pending Table layout:
> >> +
> >> +    As specified in the ARM Generic Interrupt Controller Architecture
> >> +    Specification GIC Architecture version 3.0 and version 4. The first
> > 
> > "...version 4, the first".  ("As specified in X." is not a sentence).
> > 
> >> +    1kB is not modified and therefore should contain zeroes.
> > 
> > should or must?  or always contains zeros?  Will you return an error if
> > userspace puts something non-zero in there?
> no I don't. Spec says it should initialized to zero. $
> > 
> >> +
> >> +    Future evolutions of the ITS table layout:
> >> +
> >> +    At the moment the table layout is defined and optimized for physical
> >> +    LPI support.
> > 
> > This comment is a bit confusing, because this is all about virtual
> > interrupts really, so 'physical from the point of view of the VM', but I
> > think you should just drop this sentence.
> OK
> > 
> >> +
> >> +    In the future we might implement direct injection of virtual LPIS.
> > 
> > For nesting, yes?  (on the host this should not be visible here, should
> > it?)
> yes for nesting.
> > 
> >> +    This will require an upgrade of the table layout and an evolution of
> >> +    the ABI. The ABI revision is encoded in the GITS_IIDR revision field.
> >> +    That register must be restored before the table restoration, otherwise
> >> +    the operation will fail.
> > 
> > Hmm, I thought we dealt with that before, feels a bit out of place.
> 
> OK moved to the register description.
> > 
> >> +
> >> +    ABI V1: GITS_IIDR.Revision = 1
> > 
> > I feel like this should go in the ITS register description of the IIDR.
> > Most likely, I think this particular line can be dropped, but all other
> > definitions in this file can be annoted with a minimum revision number
> > ensuring that future revisions implement this, plus potentially more
> > stuff.
> I already put the following title, "ITS Table ABI REV1" for the table
> layout description. My intent was to create a new chapter per revision.
> Is it OK?
> 

I'll have a look at the next revision.

Thanks for responding to all my comments.

-Christoffer



More information about the linux-arm-kernel mailing list