PCIe probe failure on AmLogic A311D after 6.18-rc1
Neil Armstrong
neil.armstrong at linaro.org
Mon Nov 3 01:51:46 PST 2025
Hi Bjorn,
On 10/31/25 17:13, Bjorn Helgaas wrote:
> On Fri, Oct 31, 2025 at 08:26:42PM +0800, Linnaea Lavia wrote:
>> On 10/31/2025 4:50 PM, Neil Armstrong wrote:
>>> On 10/31/25 06:34, Linnaea Lavia wrote:
>>>> On 10/30/2025 1:15 AM, Bjorn Helgaas wrote:
>>>>> On Wed, Oct 29, 2025 at 06:50:46PM +0800, Linnaea Lavia wrote:
>>>>>> On 10/29/2025 6:16 AM, Bjorn Helgaas wrote:
>>>>>
>>>>>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>>>>>> index 214ed060ca1b..9cd12924b5cb 100644
>>>>>>> --- a/drivers/pci/quirks.c
>>>>>>> +++ b/drivers/pci/quirks.c
>>>>>>> @@ -2524,6 +2524,7 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
>>>>>>> * disable both L0s and L1 for now to be safe.
>>>>>>> */
>>>>>>> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
>>>>>>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SYNOPSYS, 0xabcd, quirk_disable_aspm_l0s_l1);
>>>>>>> /*
>>>>>>> * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
>>>>>>
>>>>>> I have applied the patch on 6.18-rc3 but it's still trying to enable ASPM for some reasons.
>>>>>
>>>>> Sorry, my fault, I should have made that fixup run earlier, so the
>>>>> patch should be this instead:
>>>>>
>>>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>>>> index 214ed060ca1b..4fc04015ca0c 100644
>>>>> --- a/drivers/pci/quirks.c
>>>>> +++ b/drivers/pci/quirks.c
>>>>> @@ -2524,6 +2524,7 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
>>>>> * disable both L0s and L1 for now to be safe.
>>>>> */
>>>>> DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
>>>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SYNOPSYS, 0xabcd, quirk_disable_aspm_l0s_l1);
>>>>
>>>> L1 still got enabled
>
> Is that based on the output below?
>
> [ 5.445853] [ T48] pci 0000:00:00.0: Disabling ASPM L0s/L1
> [ 5.560448] [ T48] pci 0000:01:00.0: ASPM: default states L1
>
> If so, this doesn't necessarily mean L1 was enabled. It means the
> quirk marked the 00:00.0 Root Port so we shouldn't ever enable L0s or
> L1, and when we enumerated 01:00.0, we set its default ASPM state to
> L1.
>
> But I don't *think* L1 should actually be enabled unless we can enable
> it for both 00:00.0 and 01:00.0, and the quirk should mean that we
> can't enable it for 00:00.0.
>
> This muddle of "capable" (per Link Capabilities) vs "disabled" (either
> the Link Control shows disabled, or software said "don't ever use L1")
> is part of what makes aspm.c so confusing.
>
>>>> The card works just fine. I'm thinking the ASPM issue is
>>>> probably from the glue driver reporting the link to be down when
>>>> it's really just in low power state.
>>>
>>> You're probably right, the meson_pcie_link_up() not only checks
>>> the LTSSM but also the speed, which is probably wrong.
>>>
>>> Can you try removing the test for speed ?
>>>
>>> - if (smlh_up && rdlh_up && ltssm_up && speed_okay)
>>> + if (smlh_up && rdlh_up && ltssm_up)
>>>
>>> The other drivers just checks the link, and some only the smlh_up
>>> && rdlh_up. So you can also probably drop ltssm_up aswell.
>>
>> I can confirm that removing the check for ltssm_up and speed_okay
>> made ASPM work.
>
> I don't think meson_pcie_link_up() should have the loop in it, so the
> ltssm_up and speed_okay checks, the loop, the delay, and the timeout
> message should probably all be removed. That method is supposed to be
> a simple true/false check, and any waiting required should be done in
> dw_pcie_wait_for_link().
>
> The link was clearly up when we discovered 01:00.0, so the "wait
> linkup timeout" messages from meson_pcie_link_up() after that must be
> from dw_pcie_link_up() being called via the .map_bus() call in
> pci_generic_config_read() or pci_generic_config_write().
>
> When meson_pcie_link_up() returns false in those config accessors,
> the config accesses will fail (they won't even be attempted), so we'll
> see things like this:
>
> pci 0000:01:00.0: BAR 0: error updating (0xfc700004 != 0xffffffff)
>
> and "Unknown header type 7f" from lspci.
>
> Can you drop the ASPM quirk patch and instead try the
> meson_pcie_link_up() patch below on top of v6.18-rc3?
>
>> We still need a solution to the original issue that's preventing the
>> controller from being initialized.
>>
>> My kernel has the following patch applied, but I think it's not
>> suitable for upstream as this changes device tree bindings for PCIe
>> controller on meson.
>
> I assume the original issue is this:
>
> meson-pcie fc000000.pcie: error -EBUSY: can't request region for resource [mem 0xfc000000-0xfc3fffff]
>
> and you confirmed that it wasn't fixed by a1978b692a39 ("PCI: dwc: Use
> custom pci_ops for root bus DBI vs ECAM config access"), which
> appeared in v6.18-rc3?
>
> If it's still broken in v6.18-rc3, and the dtsi and
> meson_pcie_get_mems() patch below makes it work, we have more work to
> do, and maybe Krishna has some ideas.
>
>> diff --git a/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi b/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi
>> index dcc927a9da80..ca455f634834 100644
>> --- a/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi
>> +++ b/arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi
>> @@ -138,7 +138,7 @@ pcie: pcie at fc000000 {
>> reg = <0x0 0xfc000000 0x0 0x400000>,
>> <0x0 0xff648000 0x0 0x2000>,
>> <0x0 0xfc400000 0x0 0x200000>;
>> - reg-names = "elbi", "cfg", "config";
>> + reg-names = "dbi", "cfg", "config";
>> interrupts = <GIC_SPI 221 IRQ_TYPE_LEVEL_HIGH>;
>> #interrupt-cells = <1>;
>> interrupt-map-mask = <0 0 0 0>;
>> diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c
>> index 787469d1b396..404c4d9e1900 100644
>> --- a/drivers/pci/controller/dwc/pci-meson.c
>> +++ b/drivers/pci/controller/dwc/pci-meson.c
>> @@ -109,10 +109,6 @@ static int meson_pcie_get_mems(struct platform_device *pdev,
>> {
>> struct dw_pcie *pci = &mp->pci;
>> - pci->dbi_base = devm_platform_ioremap_resource_byname(pdev, "elbi");
>> - if (IS_ERR(pci->dbi_base))
>> - return PTR_ERR(pci->dbi_base);
>> -
>> mp->cfg_base = devm_platform_ioremap_resource_byname(pdev, "cfg");
>> if (IS_ERR(mp->cfg_base))
>> return PTR_ERR(mp->cfg_base);
>
> diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c
> index 787469d1b396..13685d89227a 100644
> --- a/drivers/pci/controller/dwc/pci-meson.c
> +++ b/drivers/pci/controller/dwc/pci-meson.c
> @@ -338,40 +338,10 @@ static struct pci_ops meson_pci_ops = {
> static bool meson_pcie_link_up(struct dw_pcie *pci)
> {
> struct meson_pcie *mp = to_meson_pcie(pci);
> - struct device *dev = pci->dev;
> - u32 speed_okay = 0;
> - u32 cnt = 0;
> - u32 state12, state17, smlh_up, ltssm_up, rdlh_up;
> + u32 state12;
>
> - do {
> - state12 = meson_cfg_readl(mp, PCIE_CFG_STATUS12);
> - state17 = meson_cfg_readl(mp, PCIE_CFG_STATUS17);
> - smlh_up = IS_SMLH_LINK_UP(state12);
> - rdlh_up = IS_RDLH_LINK_UP(state12);
> - ltssm_up = IS_LTSSM_UP(state12);
> -
> - if (PM_CURRENT_STATE(state17) < PCIE_GEN3)
> - speed_okay = 1;
> -
> - if (smlh_up)
> - dev_dbg(dev, "smlh_link_up is on\n");
> - if (rdlh_up)
> - dev_dbg(dev, "rdlh_link_up is on\n");
> - if (ltssm_up)
> - dev_dbg(dev, "ltssm_up is on\n");
> - if (speed_okay)
> - dev_dbg(dev, "speed_okay\n");
> -
> - if (smlh_up && rdlh_up && ltssm_up && speed_okay)
> - return true;
> -
> - cnt++;
> -
> - udelay(10);
> - } while (cnt < WAIT_LINKUP_TIMEOUT);
> -
> - dev_err(dev, "error: wait linkup timeout\n");
> - return false;
> + state12 = meson_cfg_readl(mp, PCIE_CFG_STATUS12);
> + return IS_SMLH_LINK_UP(state12) && IS_RDLH_LINK_UP(state12);
> }
>
> static int meson_pcie_host_init(struct dw_pcie_rp *pp)
This change looks fine, please submit it so I can CI test it.
Thanks,
Neil
More information about the linux-amlogic
mailing list