[LEDE-DEV] Transmit timeouts with mtk_eth_soc and MT7621

Kristian Evensen kristian.evensen at gmail.com
Thu Nov 9 08:21:10 PST 2017


On Thu, Nov 9, 2017 at 12:06 PM, Kristian Evensen
<kristian.evensen at gmail.com> wrote:
> I see that the CPU txds [384-511] have DDONE set and no SKB, while
> DDONE is not set for the DMA txds [0-383] and an skb is attached. I
> also looked at the content of the skb, and as far as I can see it is
> valid. Looking at the content of the SKB also shows that
> fe_reset_pending() does its job. For every timeout, there is a new set
> of packets on the ring. So new packets are put on the ring, but none
> are sent.
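
To make the ring state quoted above concrete, here is a toy model in
Python. It is only an illustrative sketch, not code from mtk_eth_soc:
the TxDesc class, the txd2 field name and the position of the DDONE bit
(bit 31) are assumptions made for the example. The ring size of 512
follows from the descriptor indices mentioned above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

RING_SIZE = 512        # implied by the descriptor indices [0-511]
TX_DMA_DONE = 1 << 31  # ASSUMED bit position of DDONE in the control word


@dataclass
class TxDesc:
    """Hypothetical stand-in for one TX descriptor plus its driver state."""
    txd2: int              # control word carrying DDONE (assumed layout)
    skb: Optional[bytes]   # stand-in for the attached skb, None if no skb


def classify(ring: List[TxDesc]) -> Tuple[List[int], List[int], List[int]]:
    """Return (free, pending, inconsistent) descriptor indices.

    free:         DDONE set and no skb attached
    pending:      DDONE clear and an skb attached (handed over, not sent)
    inconsistent: any other combination
    """
    free, pending, inconsistent = [], [], []
    for idx, desc in enumerate(ring):
        done = bool(desc.txd2 & TX_DMA_DONE)
        if done and desc.skb is None:
            free.append(idx)
        elif not done and desc.skb is not None:
            pending.append(idx)
        else:
            inconsistent.append(idx)
    return free, pending, inconsistent


def observed_ring() -> List[TxDesc]:
    """Rebuild the state from the report: 0-383 pending, 384-511 free."""
    ring = []
    for idx in range(RING_SIZE):
        if idx < 384:
            ring.append(TxDesc(txd2=0, skb=b"packet"))
        else:
            ring.append(TxDesc(txd2=TX_DMA_DONE, skb=None))
    return ring


if __name__ == "__main__":
    free, pending, inconsistent = classify(observed_ring())
    print(len(free), "free,", len(pending), "pending,",
          len(inconsistent), "inconsistent")
```

In this model every descriptor is in a consistent state, which matches
the observation that the ring itself looks sane and that packets are
queued but never completed.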

I have been hammering away at this issue during the day, and it seems
that the DMA engine, TX, etc. work just fine. However, for some
reason, the port connected to the router that has hung is able to stall
the whole switch. If I disable the port (or disconnect the cable), then TX
works again and I can for example reach 192.168.1.1 from 192.168.1.2
in my testbed. When I ran ping (from 192.168.1.2 to 192.168.1.1)
while disconnecting the cable, the first packets had a very high RTT
(~20ms). Running tcpdump showed that the reply arrived immediately, so
it seems the packets were stuck in a TX buffer for a really long time.
Could it be that there is a cache or something internally on the
switch that is causing packets to be held back, and that this cache is
invalidated and the buffers flushed when I disable the port? I cleared
the switch, DIP and SIP tables, without any effect.

If I re-enable the port, the problem reappears after a little
while (~30 seconds).

-Kristian


