[PATCH 0/37] PCI/MSI: Enforce explicit IRQ vector management by removing devres auto-free

Shawn Lin shawn.lin at rock-chips.com
Mon Feb 23 08:09:37 PST 2026


Hi Andy

On Monday, 2026/02/23 at 23:50, Andy Shevchenko wrote:
> On Mon, Feb 23, 2026 at 5:32 PM Shawn Lin <shawn.lin at rock-chips.com> wrote:
>>
>> This patch series addresses a long-standing design issue in the PCI/MSI
>> subsystem where the implicit, automatic management of IRQ vectors by
>> the devres framework conflicts with explicit driver cleanup, creating
>> ambiguity and potential resource management bugs.
>>
>> ==== The Problem: Implicit vs. Explicit Management ====
>> Historically, `pcim_enable_device()` not only manages standard PCI resources
>> (BARs) via devres but also implicitly triggers automatic IRQ vector management
>> by setting a flag that registers `pcim_msi_release()` as a cleanup action.
>>
>> This creates an ambiguous ownership model. Many drivers follow a pattern of:
>> 1. Calling `pci_alloc_irq_vectors()` to allocate interrupts.
>> 2. Also calling `pci_free_irq_vectors()` in their error paths or remove routines.
>>
>> When such a driver also uses `pcim_enable_device()`, the devres framework may
>> attempt to free the IRQ vectors a second time upon device release, leading to
>> a double-free. Analysis of the tree shows this hazardous pattern exists widely,
>> while 35 other drivers correctly rely solely on the implicit cleanup.
> 
> Is this confirmed? What I read from the cover letter, this series was
> only compile-tested, so how can you prove the problem exists in the
> first place?

Yes, it's confirmed. I hit it while debugging a double-free in an
out-of-tree PCIe wifi driver that uses
pcim_enable_device + pci_alloc_irq_vectors + pci_free_irq_vectors.
We also already had a TODO to clean up this hybrid usage, suggested by
Philipp and targeted for this cycle [1]:

[1] https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=msi
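
To make the hybrid pattern concrete, here is a minimal userspace model of it.
This is only a sketch of the behaviour described in the cover letter, not
kernel code: every mock_* name is hypothetical, and the real MSI and devres
code paths are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these mock_* names exist in the kernel. */
struct mock_pci_dev {
	bool devres_enabled;	/* set by the pcim_enable_device() stand-in */
	bool msi_managed;	/* stand-in for the is_msi_managed flag */
	bool vectors_allocated;
	int free_calls;		/* number of attempts to free the vectors */
};

/* Stand-in for pcim_enable_device(): marks the device as devres-managed. */
static void mock_pcim_enable_device(struct mock_pci_dev *dev)
{
	assert(dev != NULL);
	dev->devres_enabled = true;
}

/*
 * Stand-in for pci_alloc_irq_vectors() as described for the current tree:
 * on a devres-enabled device, a release action is registered implicitly.
 */
static void mock_pci_alloc_irq_vectors(struct mock_pci_dev *dev)
{
	dev->vectors_allocated = true;
	if (dev->devres_enabled)
		dev->msi_managed = true;
}

/* Stand-in for pci_free_irq_vectors(): counts every free attempt. */
static void mock_pci_free_irq_vectors(struct mock_pci_dev *dev)
{
	dev->free_calls++;
	dev->vectors_allocated = false;
}

/* Stand-in for the devres release action run at device teardown. */
static void mock_devres_release(struct mock_pci_dev *dev)
{
	if (dev->msi_managed)
		mock_pci_free_irq_vectors(dev);
}

/* Hybrid driver: explicit free in remove(), then devres teardown. */
static int mock_hybrid_driver_lifecycle(void)
{
	struct mock_pci_dev dev = { 0 };

	mock_pcim_enable_device(&dev);
	mock_pci_alloc_irq_vectors(&dev);
	mock_pci_free_irq_vectors(&dev);	/* driver's own remove() path */
	mock_devres_release(&dev);		/* implicit cleanup afterwards */

	return dev.free_calls;
}
```

In this model, mock_hybrid_driver_lifecycle() reports two free attempts for a
single allocation, which is the ambiguity the series wants to remove.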

> 
>> ==== The Solution: Making Management Explicit ====
>> This series enforces a clear, predictable model:
>> 1.  New Managed API (Patch 1/37): Introduces pcim_alloc_irq_vectors() and
>>      pcim_alloc_irq_vectors_affinity(). Drivers that desire devres-managed IRQ
>>      vectors should use these functions, which set the is_msi_managed flag and
>>      ensure automatic cleanup.
>> 2.  Patches 2 through 36 convert each driver that uses pcim_enable_device() alongside
>>      pci_alloc_irq_vectors() and relies on devres for IRQ vector cleanup to instead
>>      make an explicit call to pcim_alloc_irq_vectors().
>> 3.  Core Change (Patch 37/37): With the conversions above in place, this patch modifies pcim_setup_msi_release()
>>      to check only the is_msi_managed flag. This decouples automatic IRQ cleanup from
>>      pcim_enable_device(). IRQ vectors allocated via pci_alloc_irq_vectors*()
>>      are now solely the driver's responsibility to free with pci_free_irq_vectors().
>>
>> With these changes, we get a clear ownership model: explicit resource management eliminates
>> ambiguity and follows the "principle of least surprise." New drivers should choose one model and
>> use it consistently:
>> - Use `pci_alloc_irq_vectors()` + `pci_free_irq_vectors()` for explicit control.
>> - Use `pcim_alloc_irq_vectors()` for devres-managed, automatic cleanup.
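
The two models listed above can be sketched in userspace as follows. This is a
rough model under stated assumptions, not the actual implementation: the
model_* names are hypothetical, and the only behaviour taken from the cover
letter is that the managed allocator sets is_msi_managed and that the release
path checks only that flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: the model_* names are hypothetical. */
struct model_dev {
	bool msi_managed;	/* stand-in for the is_msi_managed flag */
	int free_calls;		/* number of attempts to free the vectors */
};

/* Stand-in for pci_alloc_irq_vectors(): the driver owns the vectors. */
static void model_pci_alloc_irq_vectors(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->msi_managed = false;
}

/* Stand-in for pcim_alloc_irq_vectors(): devres owns the vectors. */
static void model_pcim_alloc_irq_vectors(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->msi_managed = true;
}

/* Stand-in for pci_free_irq_vectors(): counts every free attempt. */
static void model_pci_free_irq_vectors(struct model_dev *dev)
{
	dev->free_calls++;
}

/* Release path as described for patch 37: only the flag is checked. */
static void model_devres_release(struct model_dev *dev)
{
	if (dev->msi_managed)
		model_pci_free_irq_vectors(dev);
}

/* Explicit model: the driver frees, devres teardown does nothing. */
static int model_explicit_driver(void)
{
	struct model_dev dev = { 0 };

	model_pci_alloc_irq_vectors(&dev);
	model_pci_free_irq_vectors(&dev);	/* driver's remove() */
	model_devres_release(&dev);		/* no-op: not managed */

	return dev.free_calls;
}

/* Managed model: the driver never frees, devres teardown does. */
static int model_managed_driver(void)
{
	struct model_dev dev = { 0 };

	model_pcim_alloc_irq_vectors(&dev);
	model_devres_release(&dev);		/* automatic cleanup */

	return dev.free_calls;
}
```

In this model each lifecycle ends with exactly one free attempt, whichever of
the two models the driver picks.
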
> 
> Have you checked previous attempts? Why is your series better than those?

There do not seem to be any previous attempts.

> 
>> ==== Testing And Review ====
>> 1. This series has only been compile-tested with allmodconfig.
>> 2. Given the substantial size of this patch series, I have structured the mailing
>>     to facilitate efficient review. The cover letter, the first patch and the last one will be sent
>>     to all relevant mailing lists and key maintainers to ensure broad visibility and
>>     initial feedback on the overall approach. The remaining subsystem-specific patches
>>     will be sent only to the respective subsystem maintainers and their associated
>>     mailing lists, reducing noise.
> 


