MSIs not freed in GICv3 ITS driver

Manivannan Sadhasivam manivannan.sadhasivam at linaro.org
Sun Jul 21 01:50:32 PDT 2024


On Tue, Jul 09, 2024 at 08:24:37PM +0100, Marc Zyngier wrote:
> On Tue, 09 Jul 2024 18:37:08 +0100,
> Manivannan Sadhasivam <manivannan.sadhasivam at linaro.org> wrote:
> > 
> > On Mon, Jul 08, 2024 at 06:31:57PM +0100, Marc Zyngier wrote:
> > > Mani,
> > > 
> > > On Mon, 08 Jul 2024 16:39:33 +0100,
> > > Manivannan Sadhasivam <manivannan.sadhasivam at linaro.org> wrote:
> > > > 
> > > > Hi Marc, Thomas,
> > > > 
> > > > I'm seeing a weird behavior with GICv3 ITS driver while allocating MSIs from
> > > > PCIe devices. When the PCIe driver (I'm using virtio_pci_common.c) tries to
> > > > allocate non power of 2 MSIs (like 3), then the GICv3 MSI driver always rounds
> > > > the MSI count to power of 2 to find the order. In this case, the order becomes 2
> > > > in its_alloc_device_irq().
> > > 
> > > That's because we can only allocate EventIDs as a number of ID
> > > bits. So you can't have *1* MSI, nor 3. You can have 2, 4, 8, or
> > > 2^24. This is a power-of-two architecture.
> > > 
> > 
> > Ah okay.
> > 
> > > > So 4 entries are allocated by bitmap_find_free_region().
> > > 
> > > Assuming you're calling about its_alloc_device_irq(), it looks like a
> > > bug. Or rather, some laziness on my part. The thing is, this bitmap is
> > > only dealing with sub-allocation in the pool that has been given to
> > > the endpoint. So the power-of-two crap doesn't really matter unless
> > > you are dealing with Multi-MSI, which has actual alignment
> > > requirements.
> > > 
> > 
> > Okay.
> > 
> > > >
> > > > But since the PCIe driver has only requested 3 MSIs, its_irq_domain_alloc()
> > > > will only allocate 3 MSIs, leaving one bitmap entry unused.
> > > > 
> > > > And when the driver frees the MSIs using pci_free_irq_vectors(), only 3
> > > > allocated MSIs were freed and their bitmap entries were also released. But the
> > > > entry for the additional bitmap was never released. Due to this,
> > > > its_free_device() was also never called, resulting in the ITS device not getting
> > > > freed.
> > > > 
> > > > So when the PCIe driver tries to request the MSIs again (PCIe device being
> > > > removed and inserted back), because the ITS device was not freed previously,
> > > > MSIs were again requested for the same ITS device. And due to the stale bitmap
> > > > entry, the ITS driver refuses to allocate 4 MSIs as only 3 bitmap entries were
> > > > available. This forces the PCIe driver to reduce the MSI count, which is sub
> > > > optimal.
> > > > 
> > > > This behavior might be applicable to other irqchip drivers handling MSI as well.
> > > > I want to know if this behavior is already known with MSI and irqchip drivers?
> > > > 
> > > > For fixing this issue, the PCIe drivers could always request MSIs of power of 2,
> > > > and use a dummy MSI handler for the extra number of MSIs allocated. This could
> > > > also be done in the generic MSI driver itself to avoid changes in the PCIe
> > > > drivers. But I wouldn't say it is the best possible fix.
> > > 
> > > No, that's terrible. This is just papering over a design mistake, and
> > > I refuse to go down that road.
> > > 
> > 
> > Agree. But what about other MSI drivers? And because of the MSI design, they
> > also round the requested MSI count to power of 2, leading to unused vectors and
> > those also wouldn't get freed.
> 
> This has absolutely nothing to do with the "design" of MSIs. It has
> everything to do with not special-casing Multi-MSI.
> 
> > I think this power of 2 limitation should be
> > imposed at the API level or in the MSI driver instead of silently keeping unused
> > vectors in irqchip drivers.
> 
> You really have the wrong end of the stick. The MSi API has *zero*
> control over the payload allocation. How could it? The whole point of
> having an MSI driver is to insulate the core code from such stuff.
> 

Right, but because of the way most of the MSI drivers (not all?) use bitmap to
allocate MSIs, this issue is also present in all of them. So I think the fix is
not just applicable for the gic-its-v3 driver alone.

> >
> > > > 
> > > > Is there any other way to address this issue? Or am I missing something
> > > > completely?
> > > 
> > > Well, since each endpoint handled by an ITS has its allocation tracked
> > > by a bitmap, it makes more sense to precisely track the allocation.
> > > 
> > > Here's a quick hack that managed to survive a VM boot. It may even
> > > work. The only problem with it is that it probably breaks a Multi-MSi
> > > device sitting behind a non-transparent bridge that would get its MSIs
> > > allocated after another device. In this case, we wouldn't honor the
> > > required alignment and things would break.
> > > 
> > > So take this as a proof of concept. If that works, I'll think of how
> > > to deal with this crap in a more suitable way...
> > > 
> > 
> > This works fine. Now the ITS driver allocates requested number of MSIs, thanks!
> 
> Well, as I said, this breaks tons of other things so I'm not going to
> merge this any time soon. Certainly not before Thomas gets his MSI
> rework upstream. And then I need to work out how to deal with
> Multi-MSI in the correct way.
> 
> So don't hold your breath.
> 

Sure thing. Thanks for getting it around quickly. I'll wait for a proper fix.

- Mani

-- 
மணிவண்ணன் சதாசிவம்



More information about the linux-arm-kernel mailing list