[PATCH 2/2] phy: armada-38x: fix NETA lockup when repeatedly switching speeds

Gregory CLEMENT gregory.clement at bootlin.com
Mon Jul 13 11:36:54 EDT 2020


Hello,

> On 10-07-20, 16:19, Russell King - ARM Linux admin wrote:
>> On Wed, Jul 01, 2020 at 12:27:27PM +0530, Vinod Koul wrote:
>> > On 30-06-20, 17:05, Russell King wrote:
>> > > The mvneta hardware appears to lock up in various random ways when
>> > > repeatedly switching speeds between 1G and 2.5G, which involves
>> > > reprogramming the COMPHY.  It is not entirely clear why this happens,
>> > > but best guess is that reprogramming the COMPHY glitches mvneta clocks
>> > > causing the hardware to fail.  It seems that rebooting resolves the
>> > > failure, but not down/up cycling the interface alone.
>> > > 
>> > > Various other approaches have been tried, such as trying to cleanly
>> > > power down the COMPHY and then take it back through the power up
>> > > initialisation, but this does not seem to help.
>> > > 
>> > > It was finally noticed that u-boot's last step when configuring a
>> > > COMPHY for "SGMII" mode was to poke at a register described as
>> > > "GBE_CONFIGURATION_REG", which is undocumented in any external
>> > > documentation.  All that we have is the fact that u-boot sets a bit
>> > > corresponding to the "SGMII" lane at the end of COMPHY initialisation.
>> > > 
>> > > Experimentation shows that if we clear this bit prior to changing the
>> > > speed, and then set it afterwards, mvneta does not suffer this problem
>> > > on the SolidRun Clearfog when switching speeds between 1G and 2.5G.
>> > > 
>> > > This problem was found while script-testing phylink.
>> > > 
>> > > Signed-off-by: Russell King <rmk+kernel at armlinux.org.uk>
>> > > ---
>> > >  arch/arm/boot/dts/armada-38x.dtsi          |  3 +-
>> > 
>> > lgtm, i need ack for dts parts before I can apply this
>> 
>> I'm not sure what the situation is for Bootlin, but they don't seem to
>> be very responsive right now (covid related?)
>> 
>> What I know from what I've been party to on netdev is that Bootlin
>> sent a patch for the MVPP2 driver, and the very next day someone
>> reported that the patch caused a bug.  Unfortunately, the patch got
>> picked up anyway, but there was no response from Bootlin.  After a
>> month or so, -final was released containing this patch, so now it
>> had become a regression - and still no response from Bootlin.
>> 
>> Eventually the bug got fixed - not because Bootlin fixed it, but
>> because I ended up spending the time researching how that part of
>> the network driver worked, diagnosing what was going on, and
>> eventually fixing it in the most obvious way - but it's not clear
>> that the fix was the right approach.  Bootlin never commented.  See
>> 3138a07ce219 ("net: mvpp2: fix RX hashing for non-10G ports").
>> 
>> So, I think we have to assume that Bootlin are struggling right now,
>> and as it's been over a week, it's unlikely that they are going to
>> respond soon.  What do you think we should do?
>> 
>> I also note that Rob has not responded to the DT binding change
>> either, despite me gently prodding, and Rob processing a whole raft
>> of DT binding stuff yesterday.
>> 
>> I can split the DTS change from the rest of the patch, but I don't
>> think that really helps without at least the binding change being
>> agreed.
>
> I would prefer splitting, you may sent the DTS to arm arch folks if no
> response from subarch folks

Yes please could you split the patch to put the dts apart ? And if the
binding is accepted we will apply it.

Thanks,

Gregory


>
> -- 
> ~Vinod

-- 
Gregory Clement, Bootlin
Embedded Linux and Kernel engineering
http://bootlin.com



More information about the linux-arm-kernel mailing list