[PATCH v14 00/15] phy: Add support for Lynx 10G SerDes

Mon Aug 21 15:48:34 PDT 2023

On Mon, Aug 21, 2023 at 05:06:46PM -0400, Sean Anderson wrote:
> On 8/21/23 15:58, Vladimir Oltean wrote:
> > On Mon, Aug 21, 2023 at 02:46:53PM -0400, Sean Anderson wrote:
> >> After further review, it seems the reason 28g can get away without this
> >> is because there's a one-to-one mapping between protocol controllers and
> >> lanes. Unfortunately, that regularity is not present for 10g.
> >> 
> >> --Sean
> > 
> > There are some things I saw in your phy-fsl-lynx-10g.c driver and device
> > tree bindings that I don't understand (the concept of lane groups)
> 
> Each lane group corresponds to a phy device (struct phy). For protocols
> like PCIe or XAUI which can use multiple lanes, this lets the driver
> coordinate configuring all lanes at once in the correct order.

For PCIe I admit that I don't know. I don't even know if there's any
valid (current or future) use case for having the PCIe controller have a
phandle to the SerDes. Everything else below in my reply assumes Ethernet.

> > and
> > I'm not sure if they're related with what you're saying here, so if you
> > could elaborate a bit more (ideally with an example) on the one-to-one
> > mapping and the specific problems it causes, it would be great.
> 
> For e.g. the LS2088A, SerDes1 lanes H-A use SG1-8 and XFI1-8. SerDes2
> lanes A-H use SG9-16 and XFI9-16. Each lane has its own controller, and
> the mapping is 1-to-1. In the PCCRs, each controller uses the same
> value, and is mapped in a regular way. So you can go directly from the
> lane number to the right value to mask in the PCCR, with a very simple
> translation scheme.
> 
> For e.g. the LS1046A, SerDes1 lane D uses XFI.9 (aka XFIA) and lane C
> uses XFI.10 (aka XFIB). This is the opposite of how SerDes1 lanes A-D
> use SGMII.9, .10, .5, and .6 (aka SGMIIA-D).
> 
> For e.g. the T4240, SerDes1 lanes A-H use sg1.5, .6, .10, .9, .1, .2,
> .3, .4 (aka SGMII E, F, H, G, A, B, C, D).
> 
> For e.g. the B4860, SerDes lanes A uses sgmii1 or sgmii5 and B uses
> sgmii2 or sgmii6. The MAC selected is determined based on the value
> programmed into PCCR2.

It's so exceedingly unlikely that B4860 will gain support for anything
new, that it's not even worth talking about it, or even considering it
in the design of a new driver. Just forget about it.

Let's concentrate on the Layerscapes, and on the T series to the extent
that we're not going out of our way to support them with a fairly simple
design.

In the Lynx 10G block guide that I have, PCCR2 is a register that does
something completely different from Ethernet. I'm not sure if B4860 has
a Lynx 10G block and not something else.

> While I appreciate that your hardware engineers did a better job for
> 28g, many 10g serdes arbitrarily map lanes to protocol controllers.
> I think the mapping is too irregular to tame, and it is better to say
> "if you want this configuration, program this value".

Ok, but that's a lateral argument (or I'm not understanding the connection).

Some maintainers (Mark Brown for sure, from my personal experience) prefer
that expert-level knowledge of hardware should be hardcoded into the kernel
driver based on known stuff such as the SoC-specific compatible string.
I certainly share the same view.

With your case, I think that argument is even more pertinent, because
IIUC, the lane group protocols won't be put in the SoC .dtsi (so as to
be written only once), but in the board device trees (since the
available protocols invariably depend upon the board provisioning).
So, non-expert board device tree writers will have to know what's with
the PCCR stuff. Pretty brain-intensive.

> > I may be off with my understanding of the regularity you are talking about,
> > but the LX2160 (and Lynx 28G block) also has multi-lane protocols like 40G,
> > 100G, assuming that's what you are talking about. I haven't started yet
> > working on those for the mtip_backplane driver, but I'm not currently
> > seeing a problem with the architecture where a phy_device represents a
> > single lane that's part of a multi-lane port, and not an entire group.
> 
> Resetting one lane in a group will reset the rest, which could confuse
> the driver. Additionally, treating the lanes as one phy lets us set the
> reset direction and first lane bits correctly.

Yeah, in theory that is probably correct, but in practice can't we hide
our head in the sand and say that the "phys" phandles have to have the
lanes in the same order as LNmGCR0[1STLANE] expects them (which is also
the "natural" order as the SoC RM describes it)? With a "for" loop
implementation in the MAC, that would work just fine 100% of the time.
Doing more intricate massaging of the "phys" in the consumer, and you're
just asking for trouble. My 2 cents.

Sure, it's the same kind of ask of a board device tree writer as "hey,
please give me a good PCCR value", but I honestly think that the head
scratching will be much less severe.

> > In my imagination, there are 2 cases:
> > - all 4 lanes are managed by the single dpaa2-mac consumer (which has 4
> >   phandles, and iterates over them with a "for" loop)
> > - each of the 4 lanes is managed by the respective backplane AN/LT core,
> >   and thus, there's one phandle to each lane
> 
> By doing the grouping in the driver, we also simplify the consumer
> implementation. The MAC can always use a single phy, without worrying
> about the actual number of lanes. This matches the hardware, since the
> MAC is going to talk XGMII (or whatever) to the protocol controller
> anyway.

XGMII is the link between the MAC and the PCS, but the "phys" phandle to
the SerDes gives insight to the MAC driver way beyond the PCS layer.
That kinda invalidates the idea that "you don't need to worry about the
actual number of lanes". When you're a MAC driver with an XLAUI link and
you need insight into the PMA/PMD layer, you'd better not be surprised
about the fact that there are 4 lanes, at the very least?

> I think it will be a lot easier to add multi-lane support with this
> method because it gives the driver more information about what's going
> on. The driver can control the whole configuration/reset process and the
> timing.

Also, I'm thinking that if you support multi-lane (which dpaa2-mac currently
doesn't, even in LSDK), you can't avoid multiple "phys" phandles in dpaa2-mac,
and a "for" loop, eventually, anyway. That's because if your lanes have protocol
retimers on them, those are going to be modeled as generic PHYs too. And those
will not be bundled in these "groups", because they might be one chip per lane.

The retimer thing isn't theoretical, but, due to reasons independent of NXP,
we lack the ability to provide an upstream user of the "lane retimer as
generic PHY" functionality in dpaa2-mac. So it stays downstream for now.
https://github.com/nxp-qoriq/linux/commit/627c5f626a13657f46f68b90882f329310e0e22f

So, if you're thinking of easing the work of the consumer side, I'm not
sure that the gains will be that high.

Otherwise, I will repeat the idea that lynx-10g and lynx-28g should be
treated in unison, because some drivers (dpaa2-mac, mtip_backplane) will
have to interface with both, and I don't really believe that major deviations
in software architecture between the 2 SerDes drivers are justifiable in
any way (multi-protocol handled differently, for example).