[PATCH] iommu/mediatek: Fix crash on isr after kexec()

Ricardo Ribalda ribalda at chromium.org
Fri Nov 25 09:15:18 PST 2022


Hi Robin


Thanks for your review!

On Fri, 25 Nov 2022 at 18:02, Robin Murphy <robin.murphy at arm.com> wrote:
>
> On 2022-11-25 16:28, Ricardo Ribalda wrote:
> > If the system is rebooted via kexec(), the IRQ handler might be triggered
> > before the domain is initialized, resulting in an invalid memory access
> > error.
> >
> > Fix:
> > [    0.500930] Unable to handle kernel read from unreadable memory at virtual address 0000000000000070
> > [    0.501166] Call trace:
> > [    0.501174]  report_iommu_fault+0x28/0xfc
> > [    0.501180]  mtk_iommu_isr+0x10c/0x1c0
>
> Hmm, shouldn't we clear any pending faults at probe in
> mtk_iommu_hw_init(), before the IRQ is requested? mtk_iommu_isr() might
> still want to be robust against a spurious interrupt, but then it can
> simply return without doing anything at all if the domain is NULL, since
> we'll know that's the case.
>
> Thanks,
> Robin.
>
> (It might be nice if request_irq() had a flag to say "if this IRQ looks
> pending already just clear it" for drivers that know it could only be
> spurious at that point; kexec seems to lead to this problem quite a lot...)

It is not only about the "last" IRQ before kexec. The peripherals
behind the IOMMU might still be active and producing faults, and
therefore IRQs.

I tried this:

@@ -886,6 +886,11 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data, unsigned int ban
                         upper_32_bits(data->protect_base);
        writel_relaxed(regval, bankx->base + REG_MMU_IVRP_PADDR);

+       /* Clear previous IRQs */
+       regval = readl_relaxed(bankx->base + REG_MMU_INT_CONTROL0);
+       regval |= F_INT_CLR_BIT;
+       writel_relaxed(regval, bankx->base + REG_MMU_INT_CONTROL0);
+
        if (devm_request_irq(bankx->pdev, bankx->irq, mtk_iommu_isr, 0,
                             dev_name(bankx->pdev), (void *)bankx)) {
                writel_relaxed(0, bankx->base + REG_MMU_PT_BASE_ADDR);

And I still get the same crash.
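
To illustrate why a one-time clear at probe may not be enough, here is a
minimal user-space sketch of the sequence. It is only a model: every
model_* name is hypothetical and none of them exist in mtk_iommu.c. It
assumes a master behind the IOMMU can fault again after the clear but
before a domain is attached, so the handler still needs its own NULL
check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

struct model_domain {
	int faults_reported;
};

struct model_bank {
	struct model_domain *dom;	/* NULL until a domain is attached */
	bool int_pending;		/* models the latched fault interrupt */
	int faults_dropped;		/* faults seen while dom was NULL */
};

/* Models the clear-at-probe attempt: acknowledges what is latched now. */
static void model_clear_pending(struct model_bank *bank)
{
	bank->int_pending = false;
}

/* Models a still-running master behind the IOMMU faulting again. */
static void model_device_fault(struct model_bank *bank)
{
	bank->int_pending = true;
}

/* Models the handler with the NULL-domain guard from the patch. */
static enum model_irqreturn model_isr(struct model_bank *bank)
{
	assert(bank != NULL);

	if (!bank->int_pending)
		return MODEL_IRQ_NONE;

	if (bank->dom)
		bank->dom->faults_reported++;	/* safe: domain exists */
	else
		bank->faults_dropped++;		/* no domain yet: do not dereference */

	bank->int_pending = false;		/* acknowledge the interrupt */
	return MODEL_IRQ_HANDLED;
}
```

In this model, calling model_clear_pending() and then model_device_fault()
before any domain is attached still reaches model_isr() with dom == NULL,
which is the case the dom check in the patch guards against.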


>
> > Signed-off-by: Ricardo Ribalda <ribalda at chromium.org>
> > ---
> > To: Yong Wu <yong.wu at mediatek.com>
> > To: Joerg Roedel <joro at 8bytes.org>
> > To: Will Deacon <will at kernel.org>
> > To: Robin Murphy <robin.murphy at arm.com>
> > To: Matthias Brugger <matthias.bgg at gmail.com>
> > Cc: iommu at lists.linux.dev
> > Cc: linux-mediatek at lists.infradead.org
> > Cc: linux-arm-kernel at lists.infradead.org
> > Cc: linux-kernel at vger.kernel.org
> > ---
> >   drivers/iommu/mtk_iommu.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 2ab2ecfe01f8..17f6be5a5097 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -454,7 +454,7 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
> >               fault_larb = data->plat_data->larbid_remap[fault_larb][sub_comm];
> >       }
> >
> > -     if (report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
> > +     if (dom && report_iommu_fault(&dom->domain, bank->parent_dev, fault_iova,
> >                              write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
> >               dev_err_ratelimited(
> >                       bank->parent_dev,
> >
> > ---
> > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
> > change-id: 20221125-mtk-iommu-13023f971298
> >
> > Best regards,



-- 
Ricardo Ribalda


