PCIe probe failure on AmLogic A311D after 6.18-rc1

Linnaea Lavia linnaea-von-lavia at live.com
Fri Oct 31 19:23:52 PDT 2025


On 11/1/2025 12:13 AM, Bjorn Helgaas wrote:
> On Fri, Oct 31, 2025 at 08:26:42PM +0800, Linnaea Lavia wrote:
>> On 10/31/2025 4:50 PM, Neil Armstrong wrote:
>>> On 10/31/25 06:34, Linnaea Lavia wrote:
>>>> On 10/30/2025 1:15 AM, Bjorn Helgaas wrote:
>>>>> On Wed, Oct 29, 2025 at 06:50:46PM +0800, Linnaea Lavia wrote:
>>>>>> On 10/29/2025 6:16 AM, Bjorn Helgaas wrote:
>>>>>
>>>>>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>>>>>> index 214ed060ca1b..9cd12924b5cb 100644
>>>>>>> --- a/drivers/pci/quirks.c
>>>>>>> +++ b/drivers/pci/quirks.c
>>>>>>> @@ -2524,6 +2524,7 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
>>>>>>>      * disable both L0s and L1 for now to be safe.
>>>>>>>      */
>>>>>>>     DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
>>>>>>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SYNOPSYS, 0xabcd, quirk_disable_aspm_l0s_l1);
>>>>>>>     /*
>>>>>>>      * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
>>>>>>
>>>>>> I have applied the patch on 6.18-rc3 but it's still trying to enable ASPM for some reason.
>>>>>
>>>>> Sorry, my fault, I should have made that fixup run earlier, so the
>>>>> patch should be this instead:
>>>>>
>>>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>>>> index 214ed060ca1b..4fc04015ca0c 100644
>>>>> --- a/drivers/pci/quirks.c
>>>>> +++ b/drivers/pci/quirks.c
>>>>> @@ -2524,6 +2524,7 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
>>>>>     * disable both L0s and L1 for now to be safe.
>>>>>     */
>>>>>    DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
>>>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SYNOPSYS, 0xabcd, quirk_disable_aspm_l0s_l1);
>>>>
>>>> L1 still got enabled
> 
> Is that based on the output below?
> 
>    [    5.445853] [     T48] pci 0000:00:00.0: Disabling ASPM L0s/L1
>    [    5.560448] [     T48] pci 0000:01:00.0: ASPM: default states L1
> 
> If so, this doesn't necessarily mean L1 was enabled.  It means the
> quirk marked the 00:00.0 Root Port so we shouldn't ever enable L0s or
> L1, and when we enumerated 01:00.0, we set its default ASPM state to
> L1.
> 
> But I don't *think* L1 should actually be enabled unless we can enable
> it for both 00:00.0 and 01:00.0, and the quirk should mean that we
> can't enable it for 00:00.0.

No, it's from the output of lspci -vv, which still shows L1 enabled with the patch applied:

   LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
           ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt+
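
For reference, the "ASPM L1 Enabled" text comes from the two-bit ASPM Control field (bits 1:0) of the PCIe Link Control register. The sketch below is only an illustration of that decoding; the macro and function names are made up for this example and are not the kernel's or lspci's own identifiers.

```c
#include <stdint.h>

/* ASPM Control field of the PCIe Link Control register (bits 1:0).
 * Names are local to this sketch, not the kernel's macros. */
#define LNKCTL_ASPMC     0x0003u
#define LNKCTL_ASPM_L0S  0x0001u
#define LNKCTL_ASPM_L1   0x0002u

/* Return a human-readable description of the ASPM Control field. */
static const char *aspm_state(uint16_t lnkctl)
{
	switch (lnkctl & LNKCTL_ASPMC) {
	case 0:
		return "Disabled";
	case LNKCTL_ASPM_L0S:
		return "L0s Enabled";
	case LNKCTL_ASPM_L1:
		return "L1 Enabled";
	default:
		return "L0s L1 Enabled";
	}
}

/* Nonzero when the L1 bit is set in the given Link Control value. */
static int aspm_l1_enabled(uint16_t lnkctl)
{
	return (lnkctl & LNKCTL_ASPM_L1) != 0;
}
```

Any Link Control value whose low two bits are 0b10 decodes to "L1 Enabled", regardless of the other flags on the line.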

> 
> This muddle of "capable" (per Link Capabilities) vs "disabled" (either
> the Link Control shows disabled, or software said "don't ever use L1")
> is part of what makes aspm.c so confusing.
> 
>>>> The card works just fine. I'm thinking the ASPM issue is
>>>> probably from the glue driver reporting the link to be down when
>>>> it's really just in low power state.
>>>
>>> You're probably right, the meson_pcie_link_up() not only checks
>>> the LTSSM but also the speed, which is probably wrong.
>>>
>>> Can you try removing the test for speed?
>>>
>>> -                 if (smlh_up && rdlh_up && ltssm_up && speed_okay)
>>> +                 if (smlh_up && rdlh_up && ltssm_up)
>>>
>>> The other drivers just check the link, and some check only smlh_up
>>> && rdlh_up. So you can probably drop ltssm_up as well.
>>
>> I can confirm that removing the check for ltssm_up and speed_okay
>> made ASPM work.
> 
> I don't think meson_pcie_link_up() should have the loop in it, so the
> ltssm_up and speed_okay checks, the loop, the delay, and the timeout
> message should probably all be removed.  That method is supposed to be
> a simple true/false check, and any waiting required should be done in
> dw_pcie_wait_for_link().
> 
> The link was clearly up when we discovered 01:00.0, so the "wait
> linkup timeout" messages from meson_pcie_link_up() after that must be
> from dw_pcie_link_up() being called via the .map_bus() call in
> pci_generic_config_read() or pci_generic_config_write().
> 
> When meson_pcie_link_up() returns false in those config accessors,
> the config accesses will fail (they won't even be attempted), so we'll
> see things like this:
> 
>    pci 0000:01:00.0: BAR 0: error updating (0xfc700004 != 0xffffffff)
> 
> and "Unknown header type 7f" from lspci.
> 
> Can you drop the ASPM quirk patch and instead try the
> meson_pcie_link_up() patch below on top of v6.18-rc3?
> 

I have tested it and can report that, with the patch, ASPM works out of the box.
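
To illustrate the "simple true/false check" idea discussed above, here is a minimal standalone sketch, not the actual patch. It assumes the status word has already been read from the controller, and the bit positions and names are hypothetical placeholders that should be checked against the driver's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define SMLH_LINK_UP  (1u << 6)   /* assumed: physical layer link up */
#define RDLH_LINK_UP  (1u << 16)  /* assumed: data link layer link up */

/*
 * Pure predicate: no loop, no delay, no timeout message.
 * It does not look at the LTSSM state or the link speed, so a link
 * sitting in a low power state is not reported as down.
 */
static bool link_up_from_status(uint32_t status)
{
	return (status & SMLH_LINK_UP) && (status & RDLH_LINK_UP);
}
```

Any waiting for the link to come up would then live in the caller rather than in the check itself.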

>> We still need a solution to the original issue that's preventing the
>> controller from being initialized.
>>
>> My kernel has the following patch applied, but I think it's not
>> suitable for upstream as this changes device tree bindings for PCIe
>> controller on meson.
> 
> I assume the original issue is this:
> 
>    meson-pcie fc000000.pcie: error -EBUSY: can't request region for resource [mem 0xfc000000-0xfc3fffff]
> 
> and you confirmed that it wasn't fixed by a1978b692a39 ("PCI: dwc: Use
> custom pci_ops for root bus DBI vs ECAM config access"), which
> appeared in v6.18-rc3?
> 
> If it's still broken in v6.18-rc3, and the dtsi and
> meson_pcie_get_mems() patch below makes it work, we have more work to
> do, and maybe Krishna has some ideas.
> 


