[PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
Dmitry Baryshkov
dmitry.baryshkov at linaro.org
Mon Feb 12 23:51:58 PST 2024
On Sat, 10 Feb 2024 at 00:23, Jason Gunthorpe <jgg at nvidia.com> wrote:
>
> On Fri, Feb 09, 2024 at 10:05:38PM +0200, Dmitry Baryshkov wrote:
> > On Tue, 17 Oct 2023 Jason Gunthorpe <jgg at nvidia.com> wrote:
> > > Now that the BLOCKED and IDENTITY behaviors are managed with their own
> > > domains change to the domain_alloc_paging() op.
> > >
> > > The check for using_legacy_binding is now redundant,
> > > arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this
> > > mode, so the core code will never attempt to create a DMA domain in the
> > > first place.
> > >
> > > Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core
> > > code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will
> > > not pass in IDENTITY or BLOCKED if the global statics exist, so the test
> > > for DMA is also redundant now too.
> > >
> > > Call arm_smmu_init_domain_context() early if a dev is available.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg at nvidia.com>
> > > ---
> > > drivers/iommu/arm/arm-smmu/arm-smmu.c | 21 +++++++++++++++------
> > > 1 file changed, 15 insertions(+), 6 deletions(-)
> >
> > For some reason this patch breaks booting of the APQ8096 Dragonboard820c
> > (qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
> > devices makes the board boot in most of the cases. Most frequently the
> > last parts of the log loog in a following way:
>
> It is surprising we tested this patch on some tegra systems with this
> iommu and didn't hit anything..
>
> The only real functional thing this changes is to move the domain
> initialization up in time, potentially a lot in time in some
> cases. That function does alot of things including touching HW so
> possibly there is some surprising interaction with something else.
I should not be debugging strange platforms at 1 a.m. I forgot that
there was another patch to revert. So after reverting the MPM patch,
I'm getting the following results:
>
> So, I would expect this to not WARN_ON and to make it work the same as
> before the patch:
No warnings, the platform now boots up to the point of actually
bringing up the venus device:
[ 11.906514] ath10k_pci 0000:01:00.0: qca6174 hw3.2 target
0x05030000 chip_id 0x00340aff sub 0000:0000
[ 11.907119] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 0
tracing 0 dfs 0 testmode 0
[ 11.915881] ath10k_pci 0000:01:00.0: firmware ver
WLAN.RM.4.4.1-00288- api 6 features wowlan,ignore-otp,mfp crc32
bf907c7c
[ 11.979972] Console: switching to colour frame buffer device 320x90
[ 11.990756] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1
crc32 d2863f91
[ 12.060834] msm_mdp 901000.display-controller: [drm] fb0: msmdrmfb
frame buffer device
[ 12.096203] qcom-pcie 608000.pcie: Phy link never came up
[ 12.103785] qcom-pcie 608000.pcie: PCI host bridge to bus 0001:00
[ 12.103970] qcom-venus c00000.video-codec: Adding to iommu group 3
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301
S - IMAGE_VARIANT_STRING=M8996LAB
S - OEM_IMAGE_VERSION_STRING=crm-ubuntu68
S - Boot Interface: UFS
>
> Then I'd ask you to remove the comment and do:
>
> @@ -878,7 +878,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> if (dev) {
> struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
>
> + WARN_ON(true);
> if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
> + printk("Allocation failure in arm_smmu_domain_alloc_paging()\n");
> kfree(smmu_domain);
> return NULL;
> }
>
>
> And then we may get a clue from the backtraces it generates. I only
> saw one iommu group reported in your log so I'd expect one trace?
I added dev_info + mdelays() around the arm_smmu_init_domain_context()
and I can see that it crashes within that function.
[ 29.819624] qcom-venus c00000.video-codec: Adding to iommu group 1
[ 29.833181] ------------[ cut here ]------------
[ 29.839198] WARNING: CPU: 1 PID: 35 at
drivers/iommu/arm/arm-smmu/arm-smmu.c:883
arm_smmu_domain_alloc_paging+0x80/0x174
[ 29.843980] Modules linked in:
[ 29.854824] CPU: 1 PID: 35 Comm: kworker/u18:0 Tainted: G U
6.8.0-rc3-next-20240208-05495-g20708c29957d-dirty #1739
[ 29.857694] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
[ 29.869410] Workqueue: events_unbound deferred_probe_work_func
[ 29.875658] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 29.881474] pc : arm_smmu_domain_alloc_paging+0x80/0x174
[ 29.888331] lr : arm_smmu_domain_alloc_paging+0x68/0x174
[ 29.893885] sp : ffff8000830338c0
[ 29.899179] x29: ffff8000830338c0 x28: 0000000000000000 x27: ffff800081e72000
[ 29.902396] x26: ffff00008034ee48 x25: ffff000080b24810 x24: 0000000000000000
[ 29.909513] x23: ffff800081e73000 x22: ffff000080b24810 x21: ffff800082e23258
[ 29.916633] x20: ffff00008389a700 x19: ffff00008034f600 x18: ffffffffffffffff
[ 29.918788] usb 1-1: new high-speed USB device number 2 using xhci-hcd
[ 29.923746] x17: 0000000c0000000b x16: 0000000900000008 x15: 0000000000000000
[ 29.923765] x14: 000000000000b0af x13: 0000000000000000 x12: 0000000000000166
[ 29.923783] x11: 0000000000000001 x10: 0000000000001410 x9 : 0000000000000000
[ 29.923801] x8 : ffff00008034f800 x7 : 0000000000000000 x6 : 0000000000000000
[ 29.923819] x5 : 0000000000000000 x4 : 0000000000000002 x3 : 0000000000000000
[ 29.923837] x2 : ffff800082e23290 x1 : dead4ead00000000 x0 : 0000000000000000
[ 29.923855] Call trace:
[ 29.923861] arm_smmu_domain_alloc_paging+0x80/0x174
[ 29.923872] __iommu_domain_alloc+0xcc/0xf4
[ 29.923884] iommu_setup_default_domain+0x294/0x554
[ 29.938567] Bluetooth: hci0: Frame reassembly failed (-84)
[ 29.944494] __iommu_probe_device+0x418/0x43c
[ 29.944508] iommu_probe_device+0x3c/0x80
[ 29.944519] of_iommu_configure+0x124/0x1b4
[ 29.944529] of_dma_configure_id+0x170/0x2f4
[ 29.969874] mmc0: new ultra high speed SDR104 SDHC card at address 5048
[ 29.972966] platform_dma_configure+0xa8/0xb4
[ 29.972983] really_probe+0x70/0x2ac
[ 29.972992] __driver_probe_device+0x78/0x12c
[ 29.973001] driver_probe_device+0xd8/0x160
[ 29.973010] __device_attach_driver+0xb8/0x138
[ 29.973019] bus_for_each_drv+0x80/0xdc
[ 29.973027] __device_attach+0x9c/0x188
[ 29.973037] device_initial_probe+0x14/0x20
[ 29.973046] bus_probe_device+0xac/0xb0
[ 29.973055] deferred_probe_work_func+0x8c/0xc8
[ 29.973064] process_one_work+0x210/0x5e4
[ 29.983596] mmcblk0: mmc0:5048 SD32G 28.8 GiB
[ 29.987546] worker_thread+0x1bc/0x38c
[ 29.987558] kthread+0x120/0x124
[ 29.987568] ret_from_fork+0x10/0x20
[ 29.987579] irq event stamp: 109977
[ 29.987584] hardirqs last enabled at (109977):
[<ffff800080fbbc48>] _raw_spin_unlock_irqrestore+0x6c/0x70
[ 29.987600] hardirqs last disabled at (109976):
[<ffff800080fbb0a8>] _raw_spin_lock_irqsave+0x84/0x88
[ 29.987610] softirqs last enabled at (109966):
[<ffff800080090680>] __do_softirq+0x498/0x4e0
[ 29.987619] softirqs last disabled at (109961):
[<ffff800080096184>] ____do_softirq+0x10/0x1c
[ 30.006747] mmcblk0: p1
[ 30.010291] ---[ end trace 0000000000000000 ]---
[ 30.018630] remoteproc remoteproc1: remote processor
9300000.remoteproc is now up
[ 30.024525] qcom-pcie 600000.pcie: iATU: unroll F, 32 ob, 8 ib,
align 4K, limit 4G
[ 30.044747] qcom,apr remoteproc1:smd-edge.apr_audio_svc.-1.-1:
Adding APR/GPR dev: aprsvc:service:4:3
[ 30.046118] qcom-pcie 600000.pcie: Invalid eDMA IRQs found
[ 30.051718] qcom,apr remoteproc1:smd-edge.apr_audio_svc.-1.-1:
Adding APR/GPR dev: aprsvc:service:4:4
[ 30.066435] Bluetooth: hci0: QCA Downloading qca/nvm_00440302.bin
[ 30.130736] hub 1-1:1.0: USB hub found
[ 30.150390] qcom-pcie 600000.pcie: PCIe Gen.1 x1 link up
[ 30.156394] hub 1-1:1.0: 4 ports detected
[ 30.161837] qcom-pcie 600000.pcie: PCI host bridge to bus 0000:00
[ 30.189583] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 30.195652] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
[ 30.201035] pci_bus 0000:00: root bus resource [mem 0x0c300000-0x0cffffff]
[ 30.205424] Bluetooth: hci0: QCA setup on UART is completed
[ 30.207262] pci 0000:00:00.0: [17cb:0104] type 01 class 0x060400
PCIe Root Port
[ 30.214380] usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
[ 30.219636] qcom-venus c00000.video-codec: Allocating domain
[ 30.221503] pci 0000:00:00.0: BAR 0 [mem 0x00000000-0x00000fff]
[ 30.221680] pci 0000:00:00.0: PCI bridge to [bus 01-ff]
[ 30.221772] pci 0000:00:00.0: bridge window [io 0x0000-0x0fff]
[ 30.221832] pci 0000:00:00.0: bridge window [mem 0x00000000-0x000fffff]
[ 30.221945] pci 0000:00:00.0: bridge window [mem
0x00000000-0x000fffff 64bit pref]
[ 30.222617] pci 0000:00:00.0: PME# supported from D0 D3hot
[ 30.273673] hub 2-1:1.0: USB hub found
[ 30.276567] hub 2-1:1.0: 4 ports detected
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset), D - Delta, S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301
S - IMAGE_VARIANT_STRING=M8996LAB
S - OEM_IMAGE_VERSION_STRING=crm-ubuntu68
S - Boot Interface: UFS
I traced this further, it crashes during arm_smmu_write_context_bank().
--
With best wishes
Dmitry
More information about the linux-arm-kernel
mailing list