[EXTERNAL] Re: [PATCH] PCI: cadence: Set the AFS bit in Device Capabilities 2 Register
Bjorn Helgaas
helgaas at kernel.org
Tue Aug 8 15:49:10 PDT 2023
On Fri, Aug 04, 2023 at 01:22:56PM +0530, Verma, Achal wrote:
> On 8/2/2023 9:49 PM, Bjorn Helgaas wrote:
> > In subject, "Advertise ARI Forwarding Supported".
> Ok
> >
> > It's not obvious that "AFS" refers to ARI Forwarding Supported, and
> > the bit name is enough; we don't need to know that it's in Dev Cap 2.
> > "Advertise" shows that we're just *advertising* the functionality, not
> > *enabling* it.
> >
> > On Wed, Aug 02, 2023 at 04:00:59PM +0530, Achal Verma wrote:
> > > J7 PCIe Root Complex has ARI Forwarding Support, means supporting
> > > forwarding of TLPs addressed to functions with function number greater than
> > > 7 but some PCIe instances on J7 have this bit cleared which results in
> > > failure of forwarding of TLPs destined for function number > 7.
> > > Setting the AFS bit in Device Capabilities 2 Register explicitly, resolves
> > > the issue and leads to successful access to function number > 7.
> >
> > s/AFS/ARI Forwarding Supported/
> >
> > > Some observations:
> > > 1. J7200-EVB has single PCIe instance(PCIe1) for which ARIFwd bit is not
> > > set. Enumeration gracefully fails for funciton number greater than 7 but
> > > later read/write access to these funcitons results in a crash.
> >
> > By "ARIFwd bit" here, I assume you mean PCI_EXP_DEVCAP2_ARI in the Root
> > Port? Maybe you can use the #define to make this more greppable.
> >
> will replace with PCI_EXP_DEVCAP2_ARI
> > s/funciton/function/ (twice)
> >
> > If we don't enumerate function numbers greater than 7, we shouldn't
> > have pci_dev structs for them, so why are there later read/write
> > config accesses to them?
> >
> > If the Root Port doesn't advertise ARI Forwarding Supported,
> > bridge->ari_enabled will not be set, and we shouldn't even try to
> > enumerate functions greater than 7. So it's not that enumeration
> > *fails*; it just doesn't happen at all.
> >
> > > 2. On J721E-EVB, PCIe1 instance has ARIFwd bit set while it is cleared for
> > > PCIe0 instance. This issue combined with errata i2086
> > > (Unsupported Request (UR) Response Results in External Abort) results in
> > > SERROR while scanning multi-function endpoint device.
> >
> > Is the SERROR when scanning under PCIe0 or under PCIe1?
> >
> > I'm not clear on what's happening here:
> >
> > 1) Root Port advertises PCI_EXP_DEVCAP2_ARI, we set
> > bridge->ari_enabled and scan functions > 7, we do a config read
> > to function 8, get a UR response (as expected during enumeration)
> > and that results in SERROR?
> >
> > 2) Root Port *doesn't* advertise PCI_EXP_DEVCAP2_ARI, we don't set
> > bridge->ari_enabled, so we don't try config read to function 8,
> > and something blows up later?
> >
> > 3) Something else?
> >
> Hello Bjorn,
>
> There are multiple issues which are leading to different behavior on
> different platforms.
>
> 1. PCI_EXP_DEVCAP2_ARI is not set.
>
> Consider scenario:
> J7200 (RC) --- J721E (EP with 1 PF and 4 VFs)
>
> PF enumerates successfully on J7200 but bringing up 4 associated VFs (echo 4
> > /sys/bus/pci/devices/<device>/sriov_numvfs) doesn't workout. First VF gets
> devfn = 6 and then +1 for next one on wards. 6 and 7 are setup fine but 8
> and 9 fails and UR is received while accessing them. Accessing VFs > 7
> doesn't go through the ARI support check and directly calls
> pci_setup_device(). So, since PCI_EXP_DEVCAP2_ARI is not set, unable to
> access VFs > 7.
>
> do_serror+0x34/0x88
> el1_error+0x8c/0x10c
> pci_generic_config_read+0x90/0xd0
> cdns_ti_pcie_config_read+0x14/0x28
> pci_bus_read_config_word+0x78/0xd0
> __pci_bus_find_cap_start+0x2c/0x78
> pci_find_capability+0x38/0x90
> set_pcie_port_type+0x2c/0x150
> pci_setup_device+0x90/0x728
> pci_iov_add_virtfn+0xe4/0x2e0
> sriov_enable+0x1f0/0x440
> pci_sriov_configure_simple+0x34/0x80
> sriov_numvfs_store+0xa4/0x190
Thanks! Obviously you should make the Root Port advertise
PCI_EXP_DEVCAP2_ARI if you want to use that functionality.
But I think the fact that we add a device with fn > 7 when ARI is not
enabled is also an underlying defect in iov.c.
sriov_init() already checks whether ARI is enabled, and I think we
should probably remember that somewhere and use it in
pci_iov_add_virtfn() to avoid adding VFs with fn > 7.
Bjorn
More information about the linux-arm-kernel
mailing list