[PATCH rc] iommu/arm-smmu: Use the correct type in nvidia_smmu_context_fault()
Jerry Snitselaar
jsnitsel at redhat.com
Thu May 9 12:30:42 PDT 2024
On Thu, May 09, 2024 at 12:26:36PM GMT, Jerry Snitselaar wrote:
> On Thu, May 09, 2024 at 11:51:55AM GMT, Jerry Snitselaar wrote:
> > On Thu, May 09, 2024 at 02:45:51PM GMT, Jason Gunthorpe wrote:
> > > This was missed because of the function pointer indirection.
> > >
> > > nvidia_smmu_context_fault() is also installed as an irq function, and the
> > > 'void *' was changed to a struct arm_smmu_domain. Since the iommu_domain
> > > is embedded at a non-zero offset, this causes nvidia_smmu_context_fault()
> > > to miscompute the offset. Fix up the types.
> > >
> > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000120
> > > Mem abort info:
> > > ESR = 0x0000000096000004
> > > EC = 0x25: DABT (current EL), IL = 32 bits
> > > SET = 0, FnV = 0
> > > EA = 0, S1PTW = 0
> > > FSC = 0x04: level 0 translation fault
> > > Data abort info:
> > > ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> > > CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> > > GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > > user pgtable: 4k pages, 48-bit VAs, pgdp=0000000107c9f000
> > > [0000000000000120] pgd=0000000000000000, p4d=0000000000000000
> > > Internal error: Oops: 0000000096000004 [#1] SMP
> > > Modules linked in:
> > > CPU: 1 PID: 47 Comm: kworker/u25:0 Not tainted 6.9.0-0.rc7.58.eln136.aarch64 #1
> > > Hardware name: Unknown NVIDIA Jetson Orin NX/NVIDIA Jetson Orin NX, BIOS 3.1-32827747 03/19/2023
> > > Workqueue: events_unbound deferred_probe_work_func
> > > pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > pc : nvidia_smmu_context_fault+0x1c/0x158
> > > lr : __free_irq+0x1d4/0x2e8
> > > sp : ffff80008044b6f0
> > > x29: ffff80008044b6f0 x28: ffff000080a60b18 x27: ffffd32b5172e970
> > > x26: 0000000000000000 x25: ffff0000802f5aac x24: ffff0000802f5a30
> > > x23: ffff0000802f5b60 x22: 0000000000000057 x21: 0000000000000000
> > > x20: ffff0000802f5a00 x19: ffff000087d4cd80 x18: ffffffffffffffff
> > > x17: 6234362066666666 x16: 6630303078302d30 x15: ffff00008156d888
> > > x14: 0000000000000000 x13: ffff0000801db910 x12: ffff00008156d6d0
> > > x11: 0000000000000003 x10: ffff0000801db918 x9 : ffffd32b50f94d9c
> > > x8 : 1fffe0001032fda1 x7 : ffff00008197ed00 x6 : 000000000000000f
> > > x5 : 000000000000010e x4 : 000000000000010e x3 : 0000000000000000
> > > x2 : ffffd32b51720cd8 x1 : ffff000087e6f700 x0 : 0000000000000057
> > > Call trace:
> > > nvidia_smmu_context_fault+0x1c/0x158
> > > __free_irq+0x1d4/0x2e8
> > > free_irq+0x3c/0x80
> > > devm_free_irq+0x64/0xa8
> > > arm_smmu_domain_free+0xc4/0x158
> > > iommu_domain_free+0x44/0xa0
> > > iommu_deinit_device+0xd0/0xf8
> > > __iommu_group_remove_device+0xcc/0xe0
> > > iommu_bus_notifier+0x64/0xa8
> > > notifier_call_chain+0x78/0x148
> > > blocking_notifier_call_chain+0x4c/0x90
> > > bus_notify+0x44/0x70
> > > device_del+0x264/0x3e8
> > > pci_remove_bus_device+0x84/0x120
> > > pci_remove_root_bus+0x5c/0xc0
> > > dw_pcie_host_deinit+0x38/0xe0
> > > tegra_pcie_config_rp+0xc0/0x1f0
> > > tegra_pcie_dw_probe+0x34c/0x700
> > > platform_probe+0x70/0xe8
> > > really_probe+0xc8/0x3a0
> > > __driver_probe_device+0x84/0x160
> > > driver_probe_device+0x44/0x130
> > > __device_attach_driver+0xc4/0x170
> > > bus_for_each_drv+0x90/0x100
> > > __device_attach+0xa8/0x1c8
> > > device_initial_probe+0x1c/0x30
> > > bus_probe_device+0xb0/0xc0
> > > deferred_probe_work_func+0xbc/0x120
> > > process_one_work+0x194/0x490
> > > worker_thread+0x284/0x3b0
> > > kthread+0xf4/0x108
> > > ret_from_fork+0x10/0x20
> > > Code: a9b97bfd 910003fd a9025bf5 f85a0035 (b94122a1)
> > >
> > > Cc: stable at vger.kernel.org
> > > Fixes: e0976331ad11 ("iommu/arm-smmu: Pass arm_smmu_domain to internal functions")
> > > Reported-by: Jerry Snitselaar <jsnitsel at redhat.com>
> > > Closes: https://lore.kernel.org/all/jto5e3ili4auk6sbzpnojdvhppgwuegir7mpd755anfhwcbkfz@2u5gh7bxb4iv
> > > Signed-off-by: Jason Gunthorpe <jgg at nvidia.com>
> >
> > Tested-by: Jerry Snitselaar <jsnitsel at redhat.com>
> > Acked-by: Jerry Snitselaar <jsnitsel at redhat.com>
>
> Actually looking at it again, does arm_smmu_context_fault need to be
> updated as well? The devm_request_irq call is getting passed the
> smmu_domain whether context_fault is arm_smmu_context_fault or
> nvidia_smmu_context_fault.
>
Never mind. I can't read today.
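
For anyone else following along, here is a rough userspace sketch of the
mismatch the patch fixes. The struct layouts are simplified stand-ins, not
the real kernel definitions, and the helper names (old_handler_view,
fixed_handler_view, old_handler_skew) are made up for illustration. The only
things taken from the thread are that the irq cookie is the smmu_domain and
that the old handler ran container_of() on it as if it were the embedded
iommu_domain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins; the real kernel structs have many more fields. */
struct iommu_domain {
	unsigned int type;
};

struct arm_smmu_domain {
	void *smmu;                  /* a leading member puts 'domain' at a non-zero offset */
	struct iommu_domain domain;
};

/*
 * What the old handler effectively computed: it treated the cookie as a
 * pointer to the embedded iommu_domain and subtracted the member offset,
 * as container_of() does. Integer arithmetic is used here so the sketch
 * never forms an out-of-bounds pointer.
 */
static uintptr_t old_handler_view(void *dev)
{
	return (uintptr_t)dev - offsetof(struct arm_smmu_domain, domain);
}

/* What the patched handler does: the cookie already is the smmu_domain. */
static uintptr_t fixed_handler_view(void *dev)
{
	return (uintptr_t)dev;
}

/* Number of bytes by which the old handler's pointer lands before the object. */
static size_t old_handler_skew(void)
{
	struct arm_smmu_domain d = { 0 };
	void *cookie = &d;           /* the thread says the smmu_domain is what gets passed */

	assert(fixed_handler_view(cookie) == (uintptr_t)&d);
	return (size_t)((uintptr_t)&d - old_handler_view(cookie));
}
```

old_handler_skew() returns offsetof(struct arm_smmu_domain, domain), so in
this toy layout the old code would read smmu_domain->smmu from memory that
sits before the real object. The actual offset and the value it finds there
depend on the real struct layout, which this sketch does not model.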
> >
> > > ---
> > > drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 4 +---
> > > 1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > Joerg, once Jerry acks this you should grab it for this cycle.
> > >
> > > Thanks,
> > > Jason
> > >
> > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> > > index 87bf522b9d2eec..957d988b6d832f 100644
> > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
> > > @@ -221,11 +221,9 @@ static irqreturn_t nvidia_smmu_context_fault(int irq, void *dev)
> > > unsigned int inst;
> > > irqreturn_t ret = IRQ_NONE;
> > > struct arm_smmu_device *smmu;
> > > - struct iommu_domain *domain = dev;
> > > - struct arm_smmu_domain *smmu_domain;
> > > + struct arm_smmu_domain *smmu_domain = dev;
> > > struct nvidia_smmu *nvidia;
> > >
> > > - smmu_domain = container_of(domain, struct arm_smmu_domain, domain);
> > > smmu = smmu_domain->smmu;
> > > nvidia = to_nvidia_smmu(smmu);
> > >
> > >
> > > base-commit: dff9180946cc45d90a77e1c8645989cdcfd31437
> > > --
> > > 2.43.2
> > >
> >
>