[PATCH] arm64: dts: imx8mp-debix-model-a: Disable EEE for 1000T

Emanuele Ghidoli ghidoliemanuele at gmail.com
Mon Oct 27 01:47:53 PDT 2025



On 27/10/2025 08:27, Laurent Pinchart wrote:
> Hi Andrew,
> 
> Thank you for your quick reply.
> 
> On Mon, Oct 27, 2025 at 04:08:42AM +0100, Andrew Lunn wrote:
>> Adding Russell King
>>
>> On Sun, Oct 26, 2025 at 02:29:04PM +0200, Laurent Pinchart wrote:
>>> Energy Efficient Ethernet (EEE) is broken at least for 1000T on the EQOS
>>> (DWMAC) interface. When connected to an EEE-enabled peer, the Ethernet
>>> device produces an interrupt storm. Disable EEE support to fix it.
>>>
>>> Signed-off-by: Laurent Pinchart <laurent.pinchart at ideasonboard.com>
>>> ---
>>> The exact reason for the interrupt storm is unknown, and my attempts to
>>> diagnose it were hindered by my lack of expertise with DWMAC. As far as I
>>> understand, the DWMAC implements EEE support, and so does the RTL8211E
>>> PHY according to its datasheet.
>>
>> I believe for DWMAC it is a synthesis option. However, there is a bit
>> indicating if the hardware supports it.
>>
>> The PHY should not be able to trigger an interrupt storm in the
>> MAC. So this is likely to be a DWMAC issue.
>>
>> Which interrupt bit is causing the storm?
> 
> That's where I hit my first wall :-)
> 
> I've tried to diagnose the issue by adding interrupt counters to
> dwmac4_irq_status(), counting interrupts for each bit of GMAC_INT_STATUS
> (0x00b0). Bit RGSMIIIS (0) is the only one that seems linked to the
> interrupt storm, increasing at around 10k per second. However, the
> corresponding bit in GMAC_INT_EN (0x00b4) is *not* set.
> 
> The ENET_EQOS interrupt on the i.MX8MP is an OR'ed signal that combines
> four interrupt sources:
> 
> - ENET QOS TSN LPI RX exit Interrupt
> - ENET QOS TSN Host System Interrupt
> - ENET QOS TSN Host System RX Channel Interrupts
> - ENET QOS TSN Host System TX Channel Interrupts
> 
> The last two interrupt sources are themselves local OR of channels[4:0].
> 
> I would suspect that the LPI RX exit interrupt is the one that fires
> constantly given its name, but I'm not sure how to test that.
> 
>>> What each side does exactly is unknown
>>> to me. One theory I've heard to explain the issue is that the two
>>> implementations conflict. There is no register in the RTL8211E PHY to
>>> disable EEE on the PHY side while still advertising its support to the
>>> peer and relying on the implementation in the DWMAC (if this even makes
>>> sense)
>>
>> It does not make sense. EEE is split into two major parts. The two
>> PHYs communicate with each other to negotiate the feature, if both
>> ends support it and both ends want to use it. The result of this
>> negotiation is then passed to the MACs.
>>
>> It is then the MAC who decides when to send a Low Power Indication to
>> the PHY to tell the PHY to enter low power mode. The MAC also wakes
>> the PHY when it has packets to send.
>>
>> A quick look at the data sheet for the RTL8211E suggests this is what
>> it supports.
>>
>> There are a few PHYs which implement SmartEEE, or some other similar
>> name. They operate differently, the PHY does it all, and the MAC is
>> not even aware EEE is happening. Such PHYs should really only be
>> paired with MACs which do not support EEE. An EEE capable MAC paired
>> with a SmartEEE PHY could have problems, but hopefully the EEE
>> abilities and negotiation registers in the PHY would be sufficient to
>> dissuade the MAC from doing EEE. But I would not expect a setup like
>> this to trigger an interrupt storm.
> 
> Thanks for the explanation, I read documents to try and figure out how
> it worked and didn't find such a clear and concise high-level summary.
> 
> I'm not very experienced with ethernet, but I can easily test patches or
> even rough ideas on hardware.
> 

Hi Laurent,
I had the same problem: an interrupt storm plus link instability with DWMAC.

I found out that 2c81f3357136 ("net: stmmac: convert to phylink PCS support")
is the commit that causes the problem for me.

But the PHY used by our board clearly does not support EEE, so I disabled it
directly in the driver.

I’m very interested in your investigation, as I’d like to understand why that
commit causes a regression, given that it supposedly just switches to using
phylink for EEE management.

See https://lore.kernel.org/all/20251023144857.529566-1-ghidoliemanuele@gmail.com/
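
For anyone who wants to repeat the per-bit counting described in the quoted
message, here is a rough sketch of the idea. It is only an illustration, not
the code that was actually used: struct irq_bit_stats and
irq_bit_stats_update() are hypothetical names that do not exist in stmmac,
and the status and enable values are assumed to be what was read from
GMAC_INT_STATUS (0x00b0) and GMAC_INT_EN (0x00b4). Keeping a second counter
for "asserted and enabled" makes a bit such as RGSMIIIS, which is asserted
while its enable bit is clear, stand out.

```c
#include <stdint.h>

#define IRQ_STATUS_BITS 32

/* Hypothetical debug state, not part of the stmmac driver. */
struct irq_bit_stats {
	uint64_t raw[IRQ_STATUS_BITS];     /* bit seen set in the status word */
	uint64_t enabled[IRQ_STATUS_BITS]; /* bit set in status and in enable */
};

/*
 * Account one interrupt. 'status' and 'enable' are assumed to be the values
 * read from the interrupt status and interrupt enable registers.
 */
static void irq_bit_stats_update(struct irq_bit_stats *stats,
				 uint32_t status, uint32_t enable)
{
	unsigned int bit;

	for (bit = 0; bit < IRQ_STATUS_BITS; bit++) {
		uint32_t mask = UINT32_C(1) << bit;

		if (!(status & mask))
			continue;

		stats->raw[bit]++;
		if (enable & mask)
			stats->enabled[bit]++;
	}
}
```

A bit whose raw[] counter keeps growing while its enabled[] counter stays at
zero is asserted in the status register but should not be the one raising the
interrupt line, which matches what was reported for bit 0.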

Thanks and regards,
Emanuele


