[PATCH RFC net-next v2 0/3] net: stmmac: approach 2 to solve EEE LPI reset issues

Jon Hunter jonathanh at nvidia.com
Mon Mar 10 07:20:45 PDT 2025


On 07/03/2025 17:07, Russell King (Oracle) wrote:
> On Fri, Mar 07, 2025 at 04:11:19PM +0000, Jon Hunter wrote:
>> Hi Russell,
>>
>> On 06/03/2025 15:23, Russell King (Oracle) wrote:
>>> Hi,
>>>
>>> This is a second approach to solving the STMMAC reset issues caused by
>>> the lack of receive clock from the PHY where the media is in low power
>>> mode with a PHY that supports receive clock-stop.
>>>
>>> The first approach centred around only addressing the issue in the
>>> resume path, but it seems to also happen when the platform glue module
>>> is removed and re-inserted (Jon - can you check whether that's also
>>> the case for you please?)
>>>
>>> As this is more targeted, I've dropped the patches from this series
>>> which move the call to phylink_resume(), so the link may still come
>>> up too early on resume - but that's something I also intend to fix.
>>>
>>> This is experimental - so I value test reports for this change.
>>
>>
>> The subject indicates 3 patches, but I only see 2 patches? Can you confirm
>> if there are 2 or 3?
> 
> Yes, 2 patches is correct.
> 
>> So far I have only tested the resume case with the 2 patches to make sure
>> that is working, but on Tegra186, which has been the most problematic, it is
>> not working reliably on top of next-20250305.
> 
> To confirm, you're seeing stmmac_reset() sporadically timing out on
> resume even with these patches applied? That's rather disappointing.

From what I can see, the reset is no longer failing, but now NFS is not
responding after resume ...

[   49.825094] Enabling non-boot CPUs ...
[   49.829760] Detected PIPT I-cache on CPU1
[   49.832694] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU1: 0x0000009444c004
[   49.844120] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU1: 0x00000010305116
[   49.856231] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU1: 0x00000003001066
[   49.868081] CPU1: Booted secondary processor 0x0000000000 [0x4e0f0030]
[   49.875389] CPU1 is up
[   49.877187] Detected PIPT I-cache on CPU2
[   49.880824] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU2: 0x0000009444c004
[   49.892266] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU2: 0x00000010305116
[   49.904467] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU2: 0x00000003001066
[   49.916257] CPU2: Booted secondary processor 0x0000000001 [0x4e0f0030]
[   49.923610] CPU2 is up
[   49.925194] Detected PIPT I-cache on CPU3
[   49.929010] CPU3: Booted secondary processor 0x0000000101 [0x411fd073]
[   49.935866] CPU3 is up
[   49.937983] Detected PIPT I-cache on CPU4
[   49.941824] CPU4: Booted secondary processor 0x0000000102 [0x411fd073]
[   49.948593] CPU4 is up
[   49.950810] Detected PIPT I-cache on CPU5
[   49.954651] CPU5: Booted secondary processor 0x0000000103 [0x411fd073]
[   49.961431] CPU5 is up
[   50.069784] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[   50.077634] dwmac4: Master AXI performs any burst length
[   50.080718] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
[   50.088172] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[   50.096851] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[   50.110897] usb-conn-gpio 3520000.padctl:ports:usb2-0:connector: repeated role: device
[   50.113922] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[   50.147552] OOM killer enabled.
[   50.148441] Restarting tasks ... done.
[   50.152552] VDDIO_SDMMC3_AP: voltage operation not allowed
[   50.154761] random: crng reseeded on system resumption
[   50.162912] PM: suspend exit
[   50.212215] VDDIO_SDMMC3_AP: voltage operation not allowed
[   50.271578] VDDIO_SDMMC3_AP: voltage operation not allowed
[   50.338597] VDDIO_SDMMC3_AP: voltage operation not allowed
[  234.474848] nfs: server 10.26.51.252 not responding, still trying
[  234.538769] nfs: server 10.26.51.252 not responding, still trying
[  237.546922] nfs: server 10.26.51.252 not responding, still trying
[  254.762753] nfs: server 10.26.51.252 not responding, timed out
[  254.762771] nfs: server 10.26.51.252 not responding, timed out
[  254.766376] nfs: server 10.26.51.252 not responding, timed out
[  254.766392] nfs: server 10.26.51.252 not responding, timed out
[  254.783778] nfs: server 10.26.51.252 not responding, timed out
[  254.789582] nfs: server 10.26.51.252 not responding, timed out
[  254.795421] nfs: server 10.26.51.252 not responding, timed out
[  254.801193] nfs: server 10.26.51.252 not responding, timed out

> Do either of the two attached diffs make any difference?

I will try these next.

Thanks
Jon

-- 
nvpublic



