[RFC PATCH v3 4/7] bus/cdx: add cdx-MSI domain with gic-its domain as parent
Marc Zyngier
maz at kernel.org
Thu Sep 8 07:18:53 PDT 2022
On Thu, 08 Sep 2022 10:51:01 +0100,
"Radovanovic, Aleksandar" <aleksandar.radovanovic at amd.com> wrote:
>
> [AMD Official Use Only - General]
>
>
>
> > -----Original Message-----
> > From: Marc Zyngier <maz at kernel.org>
> > Sent: 08 September 2022 09:08
> > To: Radovanovic, Aleksandar <aleksandar.radovanovic at amd.com>
> > Cc: Jason Gunthorpe <jgg at nvidia.com>; Gupta, Nipun
> > <Nipun.Gupta at amd.com>; robh+dt at kernel.org;
> > krzysztof.kozlowski+dt at linaro.org; gregkh at linuxfoundation.org;
> > rafael at kernel.org; eric.auger at redhat.com; alex.williamson at redhat.com;
> > cohuck at redhat.com; Gupta, Puneet (DCG-ENG)
> > <puneet.gupta at amd.com>; song.bao.hua at hisilicon.com;
> > mchehab+huawei at kernel.org; f.fainelli at gmail.com;
> > jeffrey.l.hugo at gmail.com; saravanak at google.com;
> > Michael.Srba at seznam.cz; mani at kernel.org; yishaih at nvidia.com;
> > robin.murphy at arm.com; will at kernel.org; joro at 8bytes.org;
> > masahiroy at kernel.org; ndesaulniers at google.com; linux-arm-
> > kernel at lists.infradead.org; linux-kbuild at vger.kernel.org; linux-
> > kernel at vger.kernel.org; devicetree at vger.kernel.org; kvm at vger.kernel.org;
> > okaya at kernel.org; Anand, Harpreet <harpreet.anand at amd.com>; Agarwal,
> > Nikhil <nikhil.agarwal at amd.com>; Simek, Michal <michal.simek at amd.com>;
> > git (AMD-Xilinx) <git at amd.com>
> > Subject: Re: [RFC PATCH v3 4/7] bus/cdx: add cdx-MSI domain with gic-its
> > domain as parent
> >
> > [CAUTION: External Email]
> >
> > OK, so you definitely need a mapping, but it cannot be a translation, and it
> > needs to be in all the possible address spaces. OMG.
>
> Could you elaborate why it needs to be in all the possible address
> spaces? I'm in no way familiar with kernel IOVA allocation, so not
> sure I understand this requirement. Note that each CDX device will
> have its own unique StreamID (in general case, equal to DeviceID
> sent to the GIC), so, from a SMMU perspective, the mapping can be
> specific to that device. As long as that IOVA is not allocated to
> any DMA region for _that_ device, things should be OK? But, I
> appreciate it might not be that simple from a kernel perspective.
Robin answered that one. This also has direct impacts on vfio and
virtualisation, as your userspace/VM cannot have any memory at the
address of the ITS (because you cannot DMA to it).
>
> > > > > As for the data part (EventID in GIC parlance), this is always
> > > > > going to be the CDX device-relative vector number - I believe this
> > > > > can't be changed, it is a hardware limitation (but I need to double-
> > check).
> > > > > That should be OK, though, as I believe this is exactly what Linux
> > > > > would write anyway, as each CDX device should be in its own IRQ
> > > > > domain (i.e. have its own ITS device table).
> > > >
> > > > But that's really the worse part. You have hardcoded what is the
> > > > *current* Linux behaviour. Things change. And baking SW behaviour
> > > > into a piece of HW looks incredibly shortsighted...
> > >
> > > For posterity, I'm not an RTL designer/architect, so share your
> > > sentiment to a certain extent. That said, I expect the decision was
> > > not based on Linux or any other SW behaviour, but because it is the
> > > most straightforward and least expensive way to do it. Having an
> > > EventID register for each and every MSI source just so you can program
> > > it in any random order costs flops and all the associated complexity
> > > of extra RTL logic (think timing closure, etc.), so trade-offs are
> > > made. The fact that it matches current Linux behaviour means we just
> > > got lucky...
> >
> > Yeah, but that's not the only problem: there is no guarantee that we have
> > enough LPIs to allocate for the device, so we'll perform a partial allocation (8
> > instead of 32 LPIs, for example).
>
> Why should that be a problem? The driver will know in advance the
> number of vectors required by the device. I expect it will need to
> call some equivalent of platform_msi_domain_alloc_irqs(), shouldn't
> that fail if not enough IRQs are allocated (and ultimately fail the
> probe)?
No, that's not how MSIs work in Linux: this is a best effort
allocation, where the allocator is free to dish out the number of MSIs
it can allocate at this point in time (the ITS will divide the
allocation by two until it succeeds). If the end-point driver doesn't
like it, it can of course decide to fail its own probe.
> Even if not, we can still inform the firmware in write_msg,
> which will serve as an indication that a particular vector is
> enabled. The firmware can decide what to do with the device if not
> all of the vectors are enabled.
Do you anticipate that the FW will be involved at the point where the
driver is finalising the device configuration? It seems that you need
to come up with some sort of spec for this if there isn't one yet.
M.
--
Without deviation from the norm, progress is not possible.
More information about the linux-arm-kernel
mailing list