phy: marvell: phy-mvebu-cp110-comphy: link failure and lockup when built as module (=M)

Josua Mayer josua at solid-run.com
Sat Oct 25 07:21:17 PDT 2025


Hi all, please excuse me for cross-posting a phy topic to the clk list; after debugging,
it seems the more appropriate place.
The subject is drivers/clk/mvebu/cp110-system-controller.c:

I need some advice on how to deal with the requirement below:
IF a pci controller clock was active at the time of the bootloader-to-Linux handover,
then the clock must not be stopped before the pci driver has completed probe.
In particular, deferred probe must not allow the clock to be stopped temporarily.

Motivation:
There appears to be a situation on CN9130 SoC (and likely Armada 8040, too)
where the clock framework stopping the pci clock before the pci driver has bound to it
causes pci to malfunction and the system to lock up.
Note that this happens even though the pci driver starts the clock again!

On Armada 8040/CN9130 the bootloader handles pci configuration and link-up.
Later the Linux pcie-armada8k driver just takes over the existing link.

On systems with all drivers builtin, the pci driver probes early enough to take ownership
of the relevant clock before it can be turned off due to lack of users.

However, once the comphy driver is built as a module, which delays pci probe until after
the rootfs is mounted, the pci clocks are stopped - and then later started again on pci probe.
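
To illustrate the ordering problem described above, here is a minimal userspace
sketch in C. It is a model only, not kernel code: struct model_clk, model_clk_enable(),
model_disable_unused() and model_boot() are hypothetical names made up for this
illustration, and the "sweep" merely stands in for the clock framework turning off
clocks that have no users. The clock name is taken from the log further down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model only; none of these names are kernel APIs. */
struct model_clk {
	const char *name;
	bool hw_enabled;     /* gate state as left in "hardware" */
	unsigned int users;  /* consumers that have enabled the clock */
	unsigned int stops;  /* how often the running clock was stopped */
};

/* A consumer (e.g. the pci driver during probe) takes the clock. */
static void model_clk_enable(struct model_clk *c)
{
	c->users++;
	c->hw_enabled = true;
}

/* Late in boot: stop every clock that is running but has no consumer. */
static void model_disable_unused(struct model_clk *clks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (clks[i].hw_enabled && clks[i].users == 0) {
			printf("disabling enabled clock \"%s\"\n", clks[i].name);
			clks[i].hw_enabled = false;
			clks[i].stops++;
		}
	}
}

/*
 * Returns how often the clock was stopped between handover and the end of
 * the consumer's probe. probe_before_sweep == true models comphy builtin,
 * false models comphy as a module (probe happens after the sweep).
 */
static unsigned int model_boot(bool probe_before_sweep)
{
	struct model_clk clks[] = {
		{ "f2440000-pcie_x4", true, 0, 0 }, /* left running by bootloader */
	};

	if (probe_before_sweep)
		model_clk_enable(&clks[0]);

	model_disable_unused(clks, 1);

	if (!probe_before_sweep)
		model_clk_enable(&clks[0]);

	/* In both cases the clock is running once probe has finished ... */
	assert(clks[0].hw_enabled);
	/* ... but only in the late-probe case was it stopped in between. */
	return clks[0].stops;
}
```

In this model, model_boot(true) reports zero stops and model_boot(false) reports one,
which is the difference the requirement above is about: the end state is identical,
only the temporary stop differs.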

Am 25.10.25 um 14:45 schrieb Josua Mayer:
> Dear Maintainers,
>
> I came across a bug relating to the cp110 comphy driver.
>
> On a board with CN9130 SoC + 2 external CPs Debian 13 freezes during boot,
> at some point after initramfs and kernel module loading has started.
>
> This occurs only when a pci card is present and had link-up from u-boot, e.g.:
>
> PCIE-0: Link up (Gen3-x4, Bus0)
> PCIE-12: Link up (Gen3-x1, Bus12)
>
> The issue is reproducible with a generic rootfs, kernel built with arm64 defconfig,
> no initramfs, but a single kernel configuration change:
>
> CONFIG_PHY_MVEBU_CP110_COMPHY=y -> m
>
> i.e. building the comphy driver as a module.
>
> The problem usually shows up as the console freezing during boot,
> before the system watchdog eventually hard resets the SoC.
>
> [1] below shows the pci kernel messages during probe of the x4 port on my board,
> with the comphy driver builtin. After this log I reach login prompt and the system
> works as intended.
> [2] shows the pci messages for the same port with comphy as a module. After the final
> line the system was hard reset once the watchdog expired.
> Both logs were captured on v6.12.48, but I did reproduce the problem with v6.18-rc1 too.
> For reference I am attaching full console logs as files to this mail.
>
> Most notably in the error case we get "Phy link never came up".
> Perhaps when comphy is a module some clock is stopped before pci probe starts ...
I added a printk to drivers/clk/mvebu/cp110-system-controller.c that fires specifically
when an enabled clock is disabled - and compared results between the good and bad case.

When comphy was a module, the following clock disable events were logged,
that do not occur with comphy builtin (network clocks filtered out):

[    2.981292] cp110-clk: disabling enabled clock "f6440000-pcie_x4"
[    3.014288] cp110-clk: disabling enabled clock "f6440000-sata-usb"
[    3.020502] cp110-clk: disabling enabled clock "f6440000-sata"
[    3.026364] cp110-clk: disabling enabled clock "f6440000-pcie_x11"
[    3.032576] cp110-clk: disabling enabled clock "f6440000-pcie_x10"
[    3.141640] cp110-clk: disabling enabled clock "f2440000-pcie_x4"

The last line with "f2440000-pcie_x4" affects the x4 port on CP0, to which I have a card connected.

Later, just before pci controller probe, the respective clock is re-enabled:

[    7.830286] cp110-clk: enabling disabled clock "f2440000-pcie_x4"

but no link is detected - and shortly after, the system freezes.

Inspired by this, I hacked the clock driver further to disarm the disable function
for pci clocks only.

And - the system works perfectly again with comphy driver as a module (v6.18-rc1).
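
For readers who want to picture the hack, below is a rough userspace sketch of the
idea in C. It is not the actual change to cp110-system-controller.c: struct model_gate,
model_is_pci_clock() and model_gate_disable() are hypothetical names, and matching on
the "-pcie_" part of the clock name is just an assumption chosen to fit the names seen
in the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Userspace model of a gate clock; not a kernel structure. */
struct model_gate {
	const char *name;
	bool enabled;
};

/*
 * Hypothetical filter: treats names such as "f2440000-pcie_x4" as pci
 * clocks by looking for the "-pcie_" substring.
 */
static bool model_is_pci_clock(const char *name)
{
	return strstr(name, "-pcie_") != NULL;
}

/* Disable callback with the disarm hack: pci clocks are left running. */
static void model_gate_disable(struct model_gate *g)
{
	assert(g != NULL);

	if (!g->enabled)
		return; /* already off, nothing to log or do */

	if (model_is_pci_clock(g->name)) {
		printf("model: keeping enabled clock \"%s\" running\n", g->name);
		return;
	}

	printf("model: disabling enabled clock \"%s\"\n", g->name);
	g->enabled = false;
}
```

Calling model_gate_disable() on a gate named "f2440000-pcie_x4" leaves it enabled,
while a gate named "f6440000-sata" is switched off as before.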

Adding the maintainers of the cp110 clock driver in CC for further advice ... (hope that is okay).

> and after the pci probe function completes it locks up the controller that actually had a link,
> even though the kernel driver did not detect it.
>
> [1]
> [    1.995477] armada8k-pcie f2600000.pcie: armada8k_pcie_probe start
> [    2.104941] armada8k-pcie f2600000.pcie: host bridge /cp0/pcie at f2600000 ranges:
> [    2.112319] armada8k-pcie f2600000.pcie:      MEM 0x00c0000000..0x00dfefffff -> 0x00c0000000
> [    2.120841] armada8k-pcie f2600000.pcie: iATU: unroll F, 8 ob, 8 ib, align 64K, limit 4G
> [    2.232055] armada8k-pcie f2600000.pcie: PCIe Gen.3 x4 link up
> [    2.238103] armada8k-pcie f2600000.pcie: PCI host bridge to bus 0000:00
> [    2.244760] pci_bus 0000:00: root bus resource [bus 00-ff]
> [    2.250282] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xdfefffff]
> [    2.257213] pci 0000:00:00.0: [11ab:0110] type 01 class 0x060400 PCIe Root Port
> [    2.264559] pci 0000:00:00.0: BAR 0 [mem 0x00000000-0x000fffff]
> [    2.270516] pci 0000:00:00.0: PCI bridge to [bus 01-ff]
> [    2.275773] pci 0000:00:00.0:   bridge window [mem 0xc0000000-0xc00fffff]
> [    2.282643] pci 0000:00:00.0: supports D1 D2
> [    2.286937] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
> [    2.294292] pci 0000:01:00.0: [144d:a804] type 00 class 0x010802 PCIe Endpoint
> [    2.301669] pci 0000:01:00.0: BAR 0 [mem 0xc0000000-0xc0003fff 64bit]
> [    2.316143] pci 0000:00:00.0: BAR 0 [mem 0xc0000000-0xc00fffff]: assigned
> [    2.322972] pci 0000:00:00.0: bridge window [mem 0xc0100000-0xc01fffff]: assigned
> [    2.330495] pci 0000:01:00.0: BAR 0 [mem 0xc0100000-0xc0103fff 64bit]: assigned
> [    2.337887] pci 0000:00:00.0: PCI bridge to [bus 01-ff]
> [    2.343142] pci 0000:00:00.0:   bridge window [mem 0xc0100000-0xc01fffff]
> [    2.349968] pci_bus 0000:00: resource 4 [mem 0xc0000000-0xdfefffff]
> [    2.356267] pci_bus 0000:01: resource 1 [mem 0xc0100000-0xc01fffff]
> [    2.362778] pcieport 0000:00:00.0: PME: Signaling with IRQ 62
> [    2.368745] pcieport 0000:00:00.0: AER: enabled with IRQ 62
> [    2.374447] armada8k-pcie f2600000.pcie: armada8k_pcie_probe end
>
> [2]
> [   17.356469] armada8k-pcie f2600000.pcie: armada8k_pcie_probe start
> [   17.444656] armada8k-pcie f2600000.pcie: host bridge /cp0/pcie at f2600000 ranges:
> [   17.452042] armada8k-pcie f2600000.pcie:      MEM 0x00c0000000..0x00dfefffff -> 0x00c0000000
> [   17.460572] armada8k-pcie f2600000.pcie: iATU: unroll F, 8 ob, 8 ib, align 64K, limit 4G
> [   18.427718] armada8k-pcie f2600000.pcie: Phy link never came up
> [   18.433776] armada8k-pcie f2600000.pcie: PCI host bridge to bus 0006:00
> [   18.440445] pci_bus 0006:00: root bus resource [bus 00-ff]
> [   18.445968] pci_bus 0006:00: root bus resource [mem 0xc0000000-0xdfefffff]
>
> I am a bit lost here on how to debug this further - kindly share some ideas if you have any.
>
> sincerely
> Josua Mayer

