net: stmmac: dwmac-meson8b: interface sometimes does not come up at boot

Jerome Brunet jbrunet at baylibre.com
Mon Jun 13 02:10:19 PDT 2022


On Sat 11 Jun 2022 at 17:00, Da Xue <da at lessconfused.com> wrote:

> On Wed, Mar 9, 2022 at 3:42 PM Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>
>  On 09.03.2022 15:57, Jerome Brunet wrote:
>  > 
>  > On Wed 09 Mar 2022 at 15:45, Erico Nunes <nunes.erico at gmail.com> wrote:
>  > 
>  >> On Sun, Mar 6, 2022 at 1:56 PM Heiner Kallweit <hkallweit1 at gmail.com> wrote:
>  >>> You could try the following (quick and dirty) test patch that fully mimics
>  >>> the vendor driver as found here:
>  >>> https://github.com/khadas/linux/blob/buildroot-aml-4.9/drivers/amlogic/ethernet/phy/amlogic.c
>  >>>
>  >>> First apply
>  >>> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=a502a8f04097e038c3daa16c5202a9538116d563
>  >>> This patch is in the net tree currently and should show up in linux-next
>  >>> beginning of the week.
>  >>>
>  >>> On top please apply the following (it includes the test patch you're working with).
>  >>
>  >> I triggered test jobs with this configuration (latest mainline +
>  >> a502a8f0409 + test patch for vendor driver behaviour), and the results
>  >> are pretty much the same as with the previous test patch from this
>  >> thread only.
>  >> That is, I never got the issue with non-functional link up anymore,
>  >> but I get the (rare) issue with link not going up.
>  >> The reproducibility is still extremely low, in the >1% range.
>  > 
>  > Low reproducibility means the problem is still there, or at least not
>  > understood completely.
>  > 
>  > I understand the benefit from the user standpoint.
>  > 
>  > Heiner if you are going to continue from the test patch you sent,
>  > I would welcome some explanation with each of the changes.
>  > 
>  The latest test patch was purely for checking whether we see any
>  difference in behavior between vendor driver and the mainlined
>  version. It's in no way meant to be applied to mainline.
>
>  > We know very little about this IP and I'm not very comfortable with
>  > tweaking/aligning with AML sdk "blindly" on a driver that has otherwise
>  > been working well so far.
>  > 
>
>  This touches one thing I wanted to ask anyway: Supposedly Amlogic
>  didn't develop their own Ethernet PHY, and if they licensed an existing
>  IP then it should be similar to some other existing PHY (that may
>  have a driver in phylib).
>
>  Then what I'll do is submit the following small change that brought
>  the error rate significantly down according to Erico's tests.
>
>  -       phy_trigger_machine(phydev);
>  +       if (irq_status & INTSRC_ANEG_COMPLETE)
>  +               phy_queue_state_machine(phydev, msecs_to_jiffies(100));
>  +       else
>  +               phy_trigger_machine(phydev);
>
>  > Thx
>  > 
>  >>
>  >> So at this point, I'm not sure how much more effort to invest into
>  >> this. Given the rate is very low and the fallback is it will just
>  >> reset the link and proceed to work, I think the situation would
>  >> already be much better with the solution from that test patch being
>  >> merged. If you propose that as a patch separately, I'm happy to test
>  >> the final submitted patch again and provide feedback there. Or if
>  >> there is another solution to try, I can try with that too.
>  >>
>  >> Thanks
>  >>
>  >>
>  >> Erico
>  > 
>
>  Heiner
>
> To help reproduce this problem, I have had this problem for as long as I can remember and it still occurs with this patch.

Same here, on both gxl and g12a. Occurrence remains unchanged.
The problem is even reproduced if the PHY is switched to polling mode, so the
merged change, related to the IRQ handling, is very unlikely to fix the
problem.

>
> This doesn't happen on first boot most of the time. It happens on reboot consistently. I have tested with AML-S805X-CC board, AML-S905X-CC V1, and V2 boards.
>

On my side, I confirm the network never seems to get stuck in u-boot but
it might break in Linux, even on the first boot after a power up from
what I have seen so far.

> I am on u-boot 22.04 with 5.18.3 which includes the patch.
> u-boot brings up ethernet on start and can grab an IP.
> Linux brings up ethernet and can grab an IP.
> reboot
> u-boot can grab an IP.
> Linux does not get anything. 
> I have to do ip link set dev eth0 down && up once or more to get ethernet to work again.
> Sometimes it spams meson8b-dwmac c9410000.ethernet eth0: Reset adapter. If it spams this, ethernet is dead and can't be recovered.

I tried several things, none showing any improvement so far:
* Make sure LPI/EEE is disabled
* Add the ethernet reset from the main controller on the MAC
* Test the various DMA modes of STMMAC
* Port the differences from u-boot and the vendor kernel in the Phy driver

I have also tried to go back in time, as far back as v4.19, but the problem
is already there. It occurs a lot less often though.
From v5.6 onward the occurrence is quite high: approx 1 in 4 boots.
On v4.19: 1 in 50 boots, up to 1 in 150.
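
These rates determine how many consecutive good boots are needed before a
candidate fix can be trusted. A minimal sketch, assuming each boot fails
independently with a fixed probability (an assumption, not something measured
in this thread); `boots_needed` and the 95% confidence level are illustrative
choices:

```python
import math

def boots_needed(failure_rate, confidence=0.95):
    """Consecutive clean boots needed to rule out `failure_rate`.

    If the bug were still present, n clean boots in a row would happen with
    probability (1 - failure_rate) ** n. Return the smallest n for which
    that probability drops to 1 - confidence or below.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(1.0 - failure_rate))

if __name__ == "__main__":
    # Rates quoted above: about 1 in 4 on v5.6+, about 1 in 50 on v4.19.
    for label, rate in (("v5.6+ (1 in 4)", 1 / 4), ("v4.19 (1 in 50)", 1 / 50)):
        print(f"{label}: {boots_needed(rate)} clean boots")
```

Under that assumption, 11 clean boots in a row are enough at the 1 in 4 rate,
while the 1 in 50 rate needs 149, which is why testing on the older kernel
takes so much longer.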

>

When the problem happens:
* link is reported up
* ifconfig / MAC is claiming to be sending packets (Tx increasing - no Rx)
* I see no traffic with wireshark

The packets are getting lost somewhere. Can't say for sure if it is in
the MAC or the PHY.
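
The symptom described above (link up, Tx counter moving, Rx counter frozen)
can be checked from userspace. A rough sketch, assuming the standard sysfs
files under /sys/class/net/<iface>/; the function names, the 5 second window
and the `eth0` default are illustrative. A quiet network with no incoming
traffic would look the same, so this is only a hint, not a diagnosis:

```python
import time
from pathlib import Path

def read_counters(ifname, root="/sys/class/net"):
    """Return (carrier, tx_packets, rx_packets) read from sysfs.

    Reading `carrier` fails with OSError while the interface is
    administratively down, so bring the interface up first.
    """
    base = Path(root) / ifname
    carrier = int((base / "carrier").read_text())
    tx = int((base / "statistics" / "tx_packets").read_text())
    rx = int((base / "statistics" / "rx_packets").read_text())
    return carrier, tx, rx

def looks_stuck(before, after):
    """True if the link is up and Tx advanced while Rx did not move."""
    _, tx0, rx0 = before
    carrier, tx1, rx1 = after
    return carrier == 1 and tx1 > tx0 and rx1 == rx0

def check(ifname="eth0", window=5.0):
    """Sample the counters twice, `window` seconds apart."""
    before = read_counters(ifname)
    time.sleep(window)
    return looks_stuck(before, read_counters(ifname))
```

Calling `check("eth0")` while something on the network is known to be sending
traffic to the board (a ping from another host, for instance) makes a frozen
Rx counter more meaningful.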

> This is fixed via power cycle so I'm assuming some register is not reset or maybe the IP is stuck.
>

`ethtool -r eth0` also seems to work around the problem.
This triggers the restart of so many things that it is close to unplugging
and replugging the ethernet cable :/

> Best,
> Da Xue



