[PATCH RFC net-next 5/5] net: dsa: always use phylink for CPU and DSA ports

Thu Jul 7 08:48:12 PDT 2022

On Thu, Jul 07, 2022 at 06:27:27PM +0300, Vladimir Oltean wrote:
> On Thu, Jul 07, 2022 at 11:09:43AM +0100, Russell King (Oracle) wrote:
> > On Wed, Jul 06, 2022 at 05:24:09PM +0100, Russell King (Oracle) wrote:
> > > On Wed, Jul 06, 2022 at 01:26:21PM +0300, Vladimir Oltean wrote:
> > > > Can we please limit phylink_set_max_link_speed() to just the CPU ports
> > > > where a fixed-link property is also missing, not just a phy-handle?
> > > > Although to be entirely correct, we can also have MLO_AN_INBAND, which
> > > > wouldn't be covered by these 2 checks and would still represent a valid
> > > > DT binding.
> > > 
> > > phylink_set_max_fixed_link() already excludes itself:
> > > 
> > >         if (pl->cfg_link_an_mode != MLO_AN_PHY || pl->phydev || pl->sfp_bus)
>                                                       ~~~~~~~~~~
> 
> If not NULL, this is an SFP PHY, right? In other words, it's supposed to protect from
> phylink_sfp_connect_phy() - code involuntarily triggered by phylink_create() ->
> phylink_register_sfp() - and not from calls to phylink_{,fwnode_}connect_phy()
> that were initiated by the phylink user between phylink_create() and
> phylink_set_max_fixed_link(), correct? Those are specified as invalid in the
> kerneldoc and that's about it - that's not what the checking is for, correct?

No, it's not to do with SFPs at all, but to do with enforcing the
pre-conditions for the function - that entire line is checking that
(a) we are in a sane state to be called, and (b) there is no
configuration initialisation beyond the default done by
phylink_create() - in other words, there is no in-band or fixed-link
specified.

Let's go through this step by step.

1. pl->cfg_link_an_mode != MLO_AN_PHY
   The default value for cfg_link_an_mode is MLO_AN_PHY. If it's
   anything other than that, then a fixed-link or in-band mode has
   been specified, and we don't want to override that. So this call
   needs to fail.

2. pl->phydev
   If a PHY has been attached, then the pre-condition for calling this
   function immediately after phylink_create() has been violated,
   because the only way it can be non-NULL is if someone's called one of
   the phylink functions that connects a PHY. Note: SFPs will not set
   their PHY here, because, for them to discover that there's a PHY, the
   network interface needs to be up, and it will never be up here... but
   in any case...

3. pl->sfp_bus
   If we have an SFP bus, then we definitely are not going to be
   operating in this default fixed-link mode - because the SFP is going
   to want to set its own parameters.
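
To make the three points above concrete, here is a minimal userspace sketch
of the pre-condition check. It is not the kernel code: struct mock_phylink,
the MOCK_AN_* values and mock_set_max_fixed_link() are hypothetical
stand-ins that model only the three fields discussed, and the step that
picks the maximum speed is left out entirely.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's MLO_AN_* link AN modes. */
enum mock_an_mode { MOCK_AN_PHY, MOCK_AN_FIXED, MOCK_AN_INBAND };

/* Hypothetical, heavily reduced model of struct phylink. */
struct mock_phylink {
	enum mock_an_mode cfg_link_an_mode; /* MOCK_AN_PHY is the default */
	const void *phydev;                 /* non-NULL once a PHY is attached */
	const void *sfp_bus;                /* non-NULL if there is an SFP bus */
};

/*
 * Model of the check: succeed only when nothing beyond the defaults has
 * been configured, i.e. no fixed-link or in-band mode, no PHY, no SFP bus.
 */
int mock_set_max_fixed_link(struct mock_phylink *pl)
{
	if (pl->cfg_link_an_mode != MOCK_AN_PHY || pl->phydev || pl->sfp_bus)
		return -EBUSY;

	/* The real function would also select speed/duplex here (omitted). */
	pl->cfg_link_an_mode = MOCK_AN_FIXED;
	return 0;
}
```

In this model a freshly created instance accepts the call once; a second
call, or a call on an instance with an in-band mode, a PHY, or an SFP bus,
gets -EBUSY.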

> > >                 return -EBUSY;
> > > 
> > > intentionally so that if there is anything specified for the port, be
> > > that a fixed link or in-band, then phylink_set_max_fixed_link() errors
> > > out with -EBUSY.
> > > 
> > > The only case that it can't detect is if there is a PHY that may be
> > > added to phylink at a later time, and that is what the check above
> > > is for.
> 
> Here by "PHY added at a later time", you do mean calling phylink_{,fwnode_}connect_phy()
> after phylink_set_max_fixed_link(), right?

Correct.

> So this is what I don't understand. If we've called phylink_set_max_fixed_link()
> we've changed pl->cfg_link_an_mode to MLO_AN_FIXED and this will
> silently break future calls to phylink_{,fwnode_}connect_phy(), so DSA
> predicts if it's going to call either of those connect_phy() functions,
> and calls phylink_set_max_fixed_link() only if it won't. Right?
> 
> You've structured the checks in this "distributed" way because phylink
> can't really predict whether phylink_{,fwnode_}connect_phy() will be
> called after phylink_set_max_fixed_link(), right? I mean, it can
> probably predict the fwnode_ variant, but not phylink_connect_phy, and
> this is why it is up to the caller to decide when to call and when not to.

phylink has no idea whether phylink_fwnode_connect_phy() will be called
with the same fwnode as phylink_create(), so it really can't make any
assumptions about whether there will be a PHY or not.

> 
> > I've updated the function description to mention this detail:
> > 
> > +/**
> > + * phylink_set_max_fixed_link() - set a fixed link configuration for phylink
> > + * @pl: a pointer to a &struct phylink returned from phylink_create()
> > + *
> > + * Set a maximum speed fixed-link configuration for the chosen interface mode
> > + * and MAC capabilities for the phylink instance if the instance has not
> > + * already been configured with a SFP, fixed link, or in-band AN mode. If the
> > + * interface mode is PHY_INTERFACE_MODE_NA, then search the supported
> > + * interfaces bitmap for the first interface that gives the fastest supported
> > + * speed.
> > 
> > Does this address your concern?
> > 
> > Thanks.
> 
> Not really, no, sorry, it just confuses me more.

I find that happens a lot when I try to add more documentation to
clarify things. I sometimes get to the point of deciding it's better
_not_ to write any documentation, because documentation just ends up
being confusing and people want more and more detail.

I've got to the point in some cases where I've had to spell out which
keys to press on the keyboard for people to formulate the "[PATCH ...]"
thing correctly, because if you put it in quotes, you get people who
will include the quotes even if you tell them not to.

I hate documentation; I seem incapable of writing it in a way people can
understand.

> It should maybe also
> say that this function shouldn't be called if phylink_{,fwnode_}connect_phy()
> is going to be called later.

It's already a precondition that phylink_{,fwnode_}connect_phy() fail if
we're in fixed-link mode (because PHYs have never been supported when in
fixed-link mode - if one remembers, the old fixed-link code used to
provide its own emulation of a PHY to make fixed-links work.) So PHYs
and fixed-links have always been mutually exclusive before phylink, and
continue to be so with phylink.
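
As a rough illustration of that mutual exclusion, the following userspace
sketch models a PHY connect that refuses to run in fixed-link mode, and a
fixed-link fallback that refuses to run once a PHY is attached. All names
(struct mock_pl, mock_connect_phy(), mock_set_fixed()) are hypothetical,
and the -EINVAL return value is an assumption of this model rather than a
statement about what the real functions return.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PHY and fixed-link AN modes. */
enum mock_mode { MOCK_MODE_PHY, MOCK_MODE_FIXED };

/* Hypothetical reduced model: just the mode and the attached PHY. */
struct mock_pl {
	enum mock_mode mode;
	const void *phydev;
};

/* Model: a PHY cannot be attached once the instance is in fixed-link mode. */
int mock_connect_phy(struct mock_pl *pl, const void *phy)
{
	if (pl->mode == MOCK_MODE_FIXED)
		return -EINVAL; /* assumed error code, for illustration only */
	pl->phydev = phy;
	return 0;
}

/* Model: the fixed-link fallback is refused once a PHY is attached. */
int mock_set_fixed(struct mock_pl *pl)
{
	if (pl->mode != MOCK_MODE_PHY || pl->phydev)
		return -EBUSY;
	pl->mode = MOCK_MODE_FIXED;
	return 0;
}
```

Whichever call comes first wins in this model, which is why the caller has
to know up front whether it intends to connect a PHY.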

> Can phylink absorb all this logic, and automatically call phylink_set_max_fixed_link()
> based on the following?
> 
> (1) struct phylink_config gets extended with a bool fallback_max_fixed_link.
> (2) DSA CPU and DSA ports set this to true in dsa_port_phylink_register().
> (3) phylink_set_max_fixed_link() is hooked into this -ENODEV error
>     condition from phylink_fwnode_phy_connect():
> 
> 	phy_fwnode = fwnode_get_phy_node(fwnode);
> 	if (IS_ERR(phy_fwnode)) {
> 		if (pl->cfg_link_an_mode == MLO_AN_PHY)
> 			return -ENODEV; <- here
> 		return 0;
> 	}

My question in response would be - why should this DSA specific behaviour
be handled completely internally within phylink, when it's a DSA
specific behaviour? Why do we need boolean flags for this?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!