[PATCH net-next 25/28] [RFC] net: dpaa: Convert to phylink

Fri Jun 17 17:45:38 PDT 2022

Hi Russell,

Thanks for the quick response.

On 6/17/22 6:01 PM, Russell King (Oracle) wrote:
> Hi,
> 
> On Fri, Jun 17, 2022 at 04:33:09PM -0400, Sean Anderson wrote:
>> This converts DPAA to phylink. For the moment, only MEMAC is converted.
>> This should work with no device tree modifications (including those made in
>> this series), except for QSGMII (as noted previously).
>>
>> One area where I wasn't sure how to do things was regarding when to call
>> phy_init and phy_power_on. Should that happen when selecting the PCS?
> 
> Is this a common serdes PHY that is shared amongst the various PCS? I
> think from what I understand having read the other patches, it is.

Each serdes has multiple lanes. There is a many-to-many relationship between
lanes and MACs. That is,

- One lane can service multiple MACs (QSGMII)
- One lane services a single MAC (SGMII, 10GBase-R, etc.)
- Multiple lanes may be used together (XAUI, HiGig, etc.) (these are not
   implemented (yet))

Each "group" of lanes corresponds to a struct phy. So in each of the above
scenarios, there would be one phy. Each PCS is a "protocol controller,"
which also corresponds to a "group" of lanes. Protocol controllers are usually
in a many-to-1 relationship with lanes (e.g. SGMII A might be associated with
Lane A, and QSGMII A might also be associated with Lane A). The only exception
to this is the B4860 where there are some SGMII protocol controllers which can
be selected by two lanes (but not at the same time).

For Ethernet, protocol controllers correspond to PCSs. Each MAC has a set of
PCSs, and an MDIO bus. Traditionally, the address for all PCSs is set to 0.
This would cause address collisions, so the serdes has to make sure to enable
only one PCS at once. It does this in pcs_set_mode.
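
To illustrate the "only one PCS at once" constraint, here is a rough userspace
model (not driver code). The type and function names (mac_pcs_bank,
bank_select, bank_responders) and the particular set of PCS kinds are made up
for illustration. The only thing taken from the above is that every PCS on a
MAC's MDIO bus sits at address 0, so at most one may be enabled.

```c
#include <stdbool.h>

/* Hypothetical set of PCSs hanging off one MAC's MDIO bus, all at address 0 */
enum pcs_kind { PCS_SGMII, PCS_QSGMII, PCS_XFI, PCS_COUNT };

struct mac_pcs_bank {
	bool enabled[PCS_COUNT];
};

/* Enable exactly one PCS and disable every other one */
void bank_select(struct mac_pcs_bank *bank, enum pcs_kind want)
{
	for (int i = 0; i < PCS_COUNT; i++)
		bank->enabled[i] = (i == (int)want);
}

/* Number of PCSs that would answer a transaction to MDIO address 0 */
int bank_responders(const struct mac_pcs_bank *bank)
{
	int n = 0;

	for (int i = 0; i < PCS_COUNT; i++)
		n += bank->enabled[i] ? 1 : 0;
	return n;
}
```

With more than one responder, reads from address 0 would collide. After
bank_select() there is always exactly one.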

> In which case, initialising the PHY prior to calling phylink_start() and
> powering down the PHY after phylink_stop() should be sufficient.

OK, that sounds reasonable.
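
As I understand the suggestion, the ordering would look roughly like the
userspace sketch below. The stub_* functions are stand-ins that only record
the call order; they are not the kernel's phy_init(), phy_power_on(),
phylink_start(), etc., and sketch_open()/sketch_stop() are hypothetical names
rather than the actual dpaa open/stop paths.

```c
#include <stdio.h>
#include <string.h>

static char trace[128];

/* Append one event name to the trace, never overrunning the buffer */
static void mark(const char *ev)
{
	size_t used = strlen(trace);

	snprintf(trace + used, sizeof(trace) - used, "%s ", ev);
}

/* Stand-ins for the serdes PHY and phylink calls (assumed to succeed) */
static int stub_phy_init(void)       { mark("phy_init"); return 0; }
static int stub_phy_power_on(void)   { mark("phy_power_on"); return 0; }
static void stub_phy_power_off(void) { mark("phy_power_off"); }
static void stub_phy_exit(void)      { mark("phy_exit"); }
static void stub_phylink_start(void) { mark("phylink_start"); }
static void stub_phylink_stop(void)  { mark("phylink_stop"); }

/* Bring the serdes up first, then let phylink start resolving the link */
int sketch_open(void)
{
	int err;

	err = stub_phy_init();
	if (err)
		return err;

	err = stub_phy_power_on();
	if (err) {
		stub_phy_exit();
		return err;
	}

	stub_phylink_start();
	return 0;
}

/* Tear down in the reverse order: phylink first, then the serdes */
void sketch_stop(void)
{
	stub_phylink_stop();
	stub_phy_power_off();
	stub_phy_exit();
}

const char *sketch_trace(void)
{
	return trace;
}
```

The point is only the bracketing: the serdes is initialised and powered before
phylink_start(), and powered down after phylink_stop().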

>> Similarly, I wasn't sure where to reconfigure the thresholds in
>> dpaa_eth_cgr_init. Should happen in link_up? If so, I think we will need
>> some kind of callback.
> 
> Bear in mind that with 1000BASE-X, SGMII, etc, we need the link working
> in order for the link to come up, so if the serdes PHY hasn't been
> properly configured for the interface mode, then the link may not come
> up.
> 
> How granular are these threshold configurations? Do they depend on
> speed? (Note that SGMII operates at a constant speed irrespective of
> the data rate due to symbol replication, so there shouldn't be a speed
> component beyond that described by the interface mode, aka
> phy_interface_t.)

I believe these thresholds are for e.g. queue depths. So it shouldn't (TM)
matter what the depth is until the link comes up and we have to receive packets.
So I guess link up is the place? TBH I'm not terribly familiar with the QMan/BMan
half of the driver.

>> This has been tested on an LS1046ARDB. Without the serdes enabled,
>> everything works. With the serdes enabled, everything works but eth3 (aka
>> MAC6). On that interface, SGMII never completes AN for whatever reason. I
>> haven't tested the counterfactual (serdes enabled but no phylink). With
>> managed=phy (e.g. unspecified), I was unable to get the interfaces to come
>> up at all.
> 
> I'm not sure of the level of accurate detail in the above statement,
> so the following is just to cover all bases...

Just to clarify, I've tested

- Without phylink or serdes (e.g. stop at patch 21 or 24) (works)
- With phylink but no serdes (e.g. stop at patch 25) (works)
- With both phylink and serdes (e.g. everything applied) (eth3 broken)

But in this case I think it might be good to investigate the remaining
combination (serdes without phylink), e.g. with patch 25 reverted.

> It's worth enabling debug in phylink so you can see what's going on -
> for example, whether the "MAC" (actually PCS today) is reporting that
> the link came up (via its pcs_get_state() callback.) Also whether
> phylib is reporting that the PHY is saying that the link is up. That
> should allow you to identify which part of the system is not

Yes, I've been using the debug prints in phylink extensively as part of
debugging :)

In this case, I added a debug statement to phylink_resolve printing out
cur_link_state, link_state.link, and pl->phy_state.link. I could see that
the phy link state was up and the mac (pcs) state was down. By inspecting
the PCS's registers, I determined that this was because AN had not completed
(in particular, the link was up in BMSR). I believe that forcing in-band-status
(by setting ovr_an_inband) shouldn't be necessary, but I was unable to get a link
up on any interface without it. In particular, the pre-phylink implementation
disabled PCS AN only for fixed links (which you can see in patch 23).
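
For reference, a simplified model of what I think I'm seeing on eth3. This is
not phylink's code. The struct and function names are invented, and the
assumption that the PCS withholds link until AN completes when in-band AN is
in use is my reading of the register dump, not something I've confirmed in the
PCS driver.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the inputs that feed link resolution */
struct link_inputs {
	bool phy_link;     /* what phylib reports for the external PHY */
	bool bmsr_link;    /* link status bit in the PCS's BMSR */
	bool an_complete;  /* AN complete bit in the PCS's BMSR */
	bool inband_an;    /* in-band AN is being used */
};

/* Assumed behavior: with in-band AN, no link is reported until AN is done */
bool pcs_reports_link(const struct link_inputs *in)
{
	if (in->inband_an && !in->an_complete)
		return false;
	return in->bmsr_link;
}

/* The resolved link needs both the PHY and the PCS to agree it is up */
bool resolved_link(const struct link_inputs *in)
{
	return in->phy_link && pcs_reports_link(in);
}
```

With phy_link and bmsr_link set but an_complete clear, resolved_link() stays
false, which matches the debug output described above.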

> Having looked through your phylink implementation, nothing obviously
> wrong stands out horribly in terms of how you're using it.
> 
> The only issue I've noticed is in dpaa_ioctl(), where you only forward
> one ioctl command to phylink, whereas there are actually three ioctls
> for PHY access - SIOCGMIIPHY, SIOCGMIIREG and SIOCSMIIREG. Note that
> phylink (and phylib) return -EOPNOTSUPP if the ioctl is not appropriate
> for them to handle. However, note that phylib will handle
> SIOCSHWTSTAMP.
> 

Ah, I'll make sure to fix that up.
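
Something along these lines, as a userspace sketch of the dispatch only.
stub_forward_to_phylink() is a stand-in for handing the request to phylink
(in the driver I'd expect that to be phylink_mii_ioctl()), and sketch_ioctl()
is a hypothetical name, not the real dpaa_ioctl() signature. The existing
SIOCSHWTSTAMP handling is left out here.

```c
#include <errno.h>
#include <linux/sockios.h>

/* Stand-in for passing the request on to phylink; always succeeds here */
int stub_forward_to_phylink(int cmd)
{
	(void)cmd;
	return 0;
}

/* Forward all three MII ioctls instead of just one */
int sketch_ioctl(int cmd)
{
	switch (cmd) {
	case SIOCGMIIPHY:
	case SIOCGMIIREG:
	case SIOCSMIIREG:
		return stub_forward_to_phylink(cmd);
	default:
		/* Anything else is not handled by this sketch */
		return -EOPNOTSUPP;
	}
}
```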

--Sean