[BUG] SFP I2C timeout forces link down with PHY_ERROR

Tue May 28 09:57:25 PDT 2024

Hi,

I saw the following warning [1] twice when testing 1000Base-T SFP
modules:

[ 1481.682501] cdns-i2c ff030000.i2c: timeout waiting on completion
[ 1481.692010] Marvell 88E1111 i2c:sfp-ge3:16: Master/Slave resolution failed
[ 1481.699910] ------------[ cut here ]------------
[ 1481.705459] phy_check_link_status+0x0/0xe8: returned: -67
[ 1481.711448] WARNING: CPU: 2 PID: 67 at drivers/net/phy/phy.c:1233 phy_state_machine+0xac/0x2ec
<snip>
[ 1481.904544] macb ff0c0000.ethernet net1: Link is Down

and a second time with some other errors too:

[   64.972751] cdns-i2c ff030000.i2c: xfer_size reg rollover. xfer aborted!
[   64.979478] cdns-i2c ff030000.i2c: xfer_size reg rollover. xfer aborted!
[   65.998108] cdns-i2c ff030000.i2c: timeout waiting on completion
[   66.010558] Marvell 88E1111 i2c:sfp-ge3:16: Master/Slave resolution failed
[   66.017856] ------------[ cut here ]------------
[   66.022786] phy_check_link_status+0x0/0xcc: returned: -67
[   66.028255] WARNING: CPU: 0 PID: 70 at drivers/net/phy/phy.c:1233 phy_state_machine+0xa4/0x2b8
<snip>
[   66.339533] macb ff0c0000.ethernet net1: Link is Down

The chain of events is:

- The I2C transaction times out for some reason (in the latter case due
  to a known hardware bug).
- mdio-i2c converts the error response to a 0xffff return value
- genphy_read_lpa sees that LPA_1000MSFAIL is set in MII_STAT1000 and
  returns -ENOLINK. This propagates up the calls stack.
- phy_check_link_status returns -ENOLINK
- phy_error_precise forces the link down with state = PHY_ERROR.

The problem with this is that although the register read fails due to a
temporary condition, the link goes down permanently (or at least until
the admin cycles the interface state).

I think some part of the stack should implement a retry mechanism, but
I'm not sure which part. One idea could be to have mdio-i2c propagate
negative errors instead of converting them to successful reads of
0xffff. But we would still need to handle that in the phy driver or in
phy_state_machine.

- Are I2C bus drivers supposed to be flaky like this? That is, are callers of
  i2c_transfer expected to handle the occasional spurious error?
- Similarly, are MDIO bus drivers allowed to be flaky?
- Is ETIMEDOUT even supposed to be recoverable? Maybe we should have
  cdns-i2c return EAGAIN instead so it gets retried by the bus
  arbitration logic in __i2c_transfer.
- ENOLINK really seems like something which we could recover from by
  resetting the phy (or even just waiting a bit). Maybe we should have
  the phy state machine just switch to PHY_NOLINK?

Of course, the best option would be to fix cdns-i2c to not be buggy, but
the hardware itself is buggy in at least one of the above cases so that
may not be practical.

--Sean