[PATCH v2] iommu/arm-smmu-v3: Free MSIs in case of ENOMEM
Robin Murphy
robin.murphy at arm.com
Tue Apr 9 04:46:58 PDT 2024
On 09/04/2024 12:31 pm, Mostafa Saleh wrote:
> Hi Robin,
>
> On Tue, Apr 09, 2024 at 12:17:54PM +0100, Robin Murphy wrote:
>> On 09/04/2024 11:43 am, Mostafa Saleh wrote:
>>> Hi Aleksandr,
>>>
>>> On Wed, Apr 03, 2024 at 12:37:59PM +0700, Aleksandr Aprelkov wrote:
>>>> If devm_add_action() returns ENOMEM, then MSIs allocated but
>>>> not freed on teardown.
>>>>
>>>> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>>>>
>>>> Fixes: 166bdbd23161 ("iommu/arm-smmu: Add support for MSI on SMMUv3")
>>>> Signed-off-by: Aleksandr Aprelkov <aaprelkov at usergate.com>
>>>> ---
>>>> v2: Use appropriate function for registration failure as
>>>> Jonathan Cameron <Jonathan.Cameron at Huawei.com> suggested.
>>>>
>>>> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +++-
>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> index 41f93c3ab160..8800af041e5f 100644
>>>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>>>> @@ -3402,7 +3402,9 @@ static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
>>>> smmu->priq.q.irq = msi_get_virq(dev, PRIQ_MSI_INDEX);
>>>> /* Add callback to free MSIs on teardown */
>>>> - devm_add_action(dev, arm_smmu_free_msis, dev);
>>>> + ret = devm_add_action_or_reset(dev, arm_smmu_free_msis, dev);
>>>> + if (ret)
>>>> + dev_warn(dev, "failed to add free MSIs callback - falling back to wired irqs\n");
>>>
>>> I am not sure that is the right fix, as allowing the driver to probe
>>> without MSIs, seems worse than leaking MSI memory.
>>>
>>> IMHO, we can just add something like:
>>> dev_err(smmu->dev, “Can’t allocate devm action, MSIs are never freed! !\n”) ;
>>
>> Honestly I don't think this matters. If we ever really did fail to allocate
>> 16 bytes, SLUB would already be screaming and spewing stacktraces, and the
>> system is dead already.
>>
>>> Also, we can’t unconditionally fallback to wired irqs if MSI exists,
>>> according to the user manual:
>>> An implementation must support one of, or optionally both of,
>>> wired interrupts and MSIs
>>> ...
>>> The discovery of support for wired interrupts is IMPLEMENTATION DEFINED.
>>>
>>> We can add some logic, to check dt/acpi irqs and to choose to fallback
>>> or not based on that, but, if we get -ENOMEM, (especially early at
>>> probe) something really went wrong, so I am not sure it’s worth
>>> the complexity.
>>
>> That logic already exists in arm_smmu_setup_unique_irqs() - the messages
>> here are in the sense of "we're giving up on MSIs and falling back to trying
>> whatever wired IRQs we may or may not have." The critical point is that
>> we're not using MSIs for some potentially actionable reason, i.e. if the
>> user does expect the system to be MSI-capable, then it could be an
>> indication of perhaps a wrong or missing msi-parent, for which they may
>> pursue a firmware fix. In other cases it's normal and expected not to use
>> MSIs though (e.g. the system just doesn't have an ITS), so we don't want to
>> be *too* noisy about it.
>
> The case I am worried about in this patch, is for systems with
> MSIs only.
> With this patch, that means, we fallback to wired irqs which don't
> exist, so the driver will probe with no interrupts at all, which in my
> opinion worse than leaking the memory.
True, the logic looks a bit off at first glance - I was halfway through
writing a reply to that effect - but then if you look past the reality
that this is all academic since it's never really going to happen
anyway, if we *did* fail to allocate 16 bytes here, there's an
incredibly high chance that immediately proceeding into
iommu_device_sysfs_add() is also going to result in another (larger)
allocation failure which ends up aborting the whole probe anyway. Plus
the chance of subsequently being able to allocate any
domains/pagetables/etc. for any meaningful IOMMU usage would seem slim.
Honestly I'd be inclined to do nothing more than add the _or_reset to
shut the static checkers up, and not waste code and data on a useless
message for a theoretical condition at all.
Cheers,
Robin.
More information about the linux-arm-kernel
mailing list