[RFC PATCH v2 08/10] iommu/riscv: support nested iommu for flushing cache

Zong Li zong.li at sifive.com
Fri Jun 28 01:19:28 PDT 2024


On Thu, Jun 20, 2024 at 12:17 AM Jason Gunthorpe <jgg at ziepe.ca> wrote:
>
> On Fri, Jun 14, 2024 at 10:21:54PM +0800, Zong Li wrote:
> > This patch implements cache_invalidate_user operation for the userspace
> > to flush the hardware caches for a nested domain through iommufd.
> >
> > Signed-off-by: Zong Li <zong.li at sifive.com>
> > ---
> >  drivers/iommu/riscv/iommu.c  | 90 ++++++++++++++++++++++++++++++++++--
> >  include/uapi/linux/iommufd.h | 11 +++++
> >  2 files changed, 97 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
> > index 410b236e9b24..d08eb0a2939e 100644
> > --- a/drivers/iommu/riscv/iommu.c
> > +++ b/drivers/iommu/riscv/iommu.c
> > @@ -1587,8 +1587,9 @@ static int riscv_iommu_attach_dev_nested(struct iommu_domain *domain, struct dev
> >       if (riscv_iommu_bond_link(riscv_domain, dev))
> >               return -ENOMEM;
> >
> > -     riscv_iommu_iotlb_inval(riscv_domain, 0, ULONG_MAX);
> > -     info->dc_user.ta |= RISCV_IOMMU_PC_TA_V;
> > +     if (riscv_iommu_bond_link(info->domain, dev))
> > +             return -ENOMEM;
>
> ?? Is this in the wrong patch then? Confused

Yes, it belongs in the 7th patch of this series. I will fix it in the
next version.

>
> >       riscv_iommu_iodir_update(iommu, dev, &info->dc_user);
> >
> >       info->domain = riscv_domain;
> > @@ -1611,13 +1612,92 @@ static void riscv_iommu_domain_free_nested(struct iommu_domain *domain)
> >       kfree(riscv_domain);
> >  }
> >
> > +static int riscv_iommu_fix_user_cmd(struct riscv_iommu_command *cmd,
> > +                                 unsigned int pscid, unsigned int gscid)
> > +{
> > +     u32 opcode = FIELD_GET(RISCV_IOMMU_CMD_OPCODE, cmd->dword0);
> > +
> > +     switch (opcode) {
> > +     case RISCV_IOMMU_CMD_IOTINVAL_OPCODE:
> > +             u32 func = FIELD_GET(RISCV_IOMMU_CMD_FUNC, cmd->dword0);
> > +
> > +             if (func != RISCV_IOMMU_CMD_IOTINVAL_FUNC_GVMA &&
> > +                 func != RISCV_IOMMU_CMD_IOTINVAL_FUNC_VMA) {
> > +                     pr_warn("The IOTINVAL function: 0x%x is not supported\n",
> > +                             func);
> > +                     return -EOPNOTSUPP;
> > +             }
> > +
> > +             if (func == RISCV_IOMMU_CMD_IOTINVAL_FUNC_GVMA) {
> > +                     cmd->dword0 &= ~RISCV_IOMMU_CMD_FUNC;
> > +                     cmd->dword0 |= FIELD_PREP(RISCV_IOMMU_CMD_FUNC,
> > +                                               RISCV_IOMMU_CMD_IOTINVAL_FUNC_VMA);
> > +             }
> > +
> > +             cmd->dword0 &= ~(RISCV_IOMMU_CMD_IOTINVAL_PSCID |
> > +                              RISCV_IOMMU_CMD_IOTINVAL_GSCID);
> > +             riscv_iommu_cmd_inval_set_pscid(cmd, pscid);
> > +             riscv_iommu_cmd_inval_set_gscid(cmd, gscid);
> > +             break;
> > +     case RISCV_IOMMU_CMD_IODIR_OPCODE:
> > +             /*
> > +              * Ensure the device ID is right. We expect that VMM has
> > +              * transferred the device ID to host's from guest's.
> > +              */
>
> I'm not sure what this remark means, but I expect you will need to
> translate any devices IDs from virtual to physical.

I think we need a data structure that maps virtual device IDs to
physical ones. I didn't do that here because our internal implementation
translates the ID in the VMM, but as you mentioned, the kernel can't
rely on the VMM to do that.
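
As a rough illustration of such a mapping, here is a minimal userspace
sketch of a virtual-to-physical device ID table. All names here
(vdev_map, vdev_map_add, vdev_map_lookup, VDEV_MAP_MAX) are hypothetical
and not part of the patch or of any kernel API; a real driver would
likely use an existing kernel container and proper locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VDEV_MAP_MAX 16 /* arbitrary capacity chosen for this sketch */

/* One guest-visible device ID and the host device ID it stands for. */
struct vdev_map_entry {
	uint32_t virt_id; /* device ID as seen by the guest */
	uint32_t phys_id; /* device ID used by the host IOMMU */
};

struct vdev_map {
	struct vdev_map_entry entries[VDEV_MAP_MAX];
	size_t count;
};

/* Record a pair; returns 0 on success, -1 if the table is full or
 * virt_id is already present. */
static int vdev_map_add(struct vdev_map *map, uint32_t virt_id,
			uint32_t phys_id)
{
	assert(map != NULL);

	for (size_t i = 0; i < map->count; i++)
		if (map->entries[i].virt_id == virt_id)
			return -1;
	if (map->count >= VDEV_MAP_MAX)
		return -1;

	map->entries[map->count].virt_id = virt_id;
	map->entries[map->count].phys_id = phys_id;
	map->count++;
	return 0;
}

/* Translate virt_id; returns 0 and stores the host ID in *phys_id,
 * or -1 if the guest ID is unknown (the command should be rejected). */
static int vdev_map_lookup(const struct vdev_map *map, uint32_t virt_id,
			   uint32_t *phys_id)
{
	assert(map != NULL && phys_id != NULL);

	for (size_t i = 0; i < map->count; i++) {
		if (map->entries[i].virt_id == virt_id) {
			*phys_id = map->entries[i].phys_id;
			return 0;
		}
	}
	return -1;
}
```

With something like this, the device ID in a user-supplied IODIR command
would be looked up and rewritten before the command is queued, and an
unknown ID would fail the request instead of being trusted.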

>
> >
> >  static int
> > -riscv_iommu_get_dc_user(struct device *dev, struct iommu_hwpt_riscv_iommu *user_arg)
> > +riscv_iommu_get_dc_user(struct device *dev, struct iommu_hwpt_riscv_iommu *user_arg,
> > +                     struct riscv_iommu_domain *s1_domain)
> >  {
> >       struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> >       struct riscv_iommu_device *iommu = dev_to_iommu(dev);
> > @@ -1663,6 +1743,8 @@ riscv_iommu_get_dc_user(struct device *dev, struct iommu_hwpt_riscv_iommu *user_
> >                      riscv_iommu_get_dc(iommu, fwspec->ids[i]),
> >                      sizeof(struct riscv_iommu_dc));
> >               info->dc_user.fsc = dc.fsc;
> > +             info->dc_user.ta = FIELD_PREP(RISCV_IOMMU_PC_TA_PSCID, s1_domain->pscid) |
> > +                                           RISCV_IOMMU_PC_TA_V;
> >       }
>
> It is really weird that the s1 domain has any kind of id. What is the
> PSCID? Is it analogous to VMID on ARM?

I think the VMID is closer to the GSCID. The PSCID might be more like
the ASID, as it is used as the address space ID for the process
identified by the first-stage page table.
The GSCID is used to tag G-stage TLB entries, the PSCID is used to tag
single-stage TLB entries, and the tuple {GSCID, PSCID} is used to tag
VS-stage TLB entries. The IOTINVAL.VMA command can flush mappings by
matching the GSCID only, the PSCID only, or the tuple {GSCID, PSCID}.
Consider two devices passed through to the same guest: we will have two
s1 domains under the same s2 domain, and we can flush their mappings by
{GSCID, PSCID} and {GSCID, PSCID'} respectively.
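
To make the matching concrete, here is a toy userspace model of the
three flush scopes described above. It is only a sketch: the struct and
function names are hypothetical, the gv/pscv flags stand in for "match
on GSCID" and "match on PSCID", and it ignores address matching, global
mappings, and any other conditions the specification places on
IOTINVAL.VMA.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A cached translation tagged with the IDs it was created under. */
struct tlb_entry {
	uint32_t gscid;
	uint32_t pscid;
	bool valid;
};

/* A simplified invalidation request: each flag enables one tag check. */
struct inval_cmd {
	bool gv;   /* compare the entry's GSCID against gscid */
	bool pscv; /* compare the entry's PSCID against pscid */
	uint32_t gscid;
	uint32_t pscid;
};

/* An entry matches when every enabled tag check passes. */
static bool inval_matches(const struct tlb_entry *e,
			  const struct inval_cmd *c)
{
	if (!e->valid)
		return false;
	if (c->gv && e->gscid != c->gscid)
		return false;
	if (c->pscv && e->pscid != c->pscid)
		return false;
	return true;
}

/* Invalidate all matching entries; returns how many were flushed. */
static size_t tlb_invalidate(struct tlb_entry *tlb, size_t n,
			     const struct inval_cmd *c)
{
	size_t flushed = 0;

	for (size_t i = 0; i < n; i++) {
		if (inval_matches(&tlb[i], c)) {
			tlb[i].valid = false;
			flushed++;
		}
	}
	return flushed;
}
```

In the two-device case, both s1 domains share one GSCID but have
different PSCIDs, so a request with both flags set touches only one
device's entries, while a request with only gv set flushes both.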

>
> Jason


