[LEDE-DEV] Transmit timeouts with mtk_eth_soc and MT7621
John Crispin
john at phrozen.org
Sat Aug 19 14:52:13 PDT 2017
On 19/08/17 23:13, Kristian Evensen wrote:
> Hi both,
>
> On Sat, 19 Aug 2017 at 20:16, John Crispin <john at phrozen.org> wrote:
>
> Hi All,
>
> I have a staged commit on my laptop that makes all the (upstream)
> ethernet fixes that I pushed to mt7623 work on mt7621. Please hang on
> for a few more days until I have finished testing the support. This will
> add the latest upstream ethernet support + DSA.
>
>
> Thanks for the follow-up Mingyu and the info John. I have not had time
> to investigate the issue further (holiday backlog ...), but will start
> working on trying to reproduce it at the end of next week. I have
> deployed the patch to some routers and have not seen any regressions,
> but I would like to know how to reliably trigger the issue before
> concluding :)
>
> John, do your commits include a fix similar to what Mingyu sent me?
With my fixes the mt7623 passes a 48h stress test, running an iperf test
with 200 parallel flows at full wire speed. Once the fixes are backported
to mt7621, I am pretty confident that they will yield the maximum
stable performance we can get.
John
>
> Kristian
>
>
>
> John
>
>
> On 19/08/17 17:06, Mingyu Li wrote:
> > Hi Kristian.
> >
> > Does this patch work?
> >
> > 2017-07-24 23:45 GMT+08:00 Mingyu Li <igvtee at gmail.com>:
> >> I guess other interrupts may be causing the problem, because the
> >> ethernet receive flow is interrupted by other hardware. So using the
> >> SD card, wifi or USB can generate such interrupts.
> >>
> >> 2017-07-24 17:19 GMT+08:00 Kristian Evensen <kristian.evensen at gmail.com>:
> >>> Hi,
> >>>
> >>> On Mon, Jul 24, 2017 at 4:02 AM, Mingyu Li <igvtee at gmail.com> wrote:
> >>>> I guess the problem is that some tx data is not freed while the tx
> >>>> interrupt has already been cleared, which causes the tx timeout. The
> >>>> old code frees the data first and then clears the interrupt, but new
> >>>> data may arrive after the free and before the interrupt is cleared.
> >>>> So change it to clear the interrupt first and then clean all tx data
> >>>> (also removing the budget limit). If new tx data arrives, the hardware
> >>>> will set the tx interrupt flag again, and we will free it next time.
> >>>> I also applied this to the rx flow.
> >>> Thanks for the detailed explanation. I have deployed an image with the
> >>> patch to some of the routers showing this issue, so let's wait and see.
> >>> Of course, all routers have been stable for the last couple of days
> >>> (including before the weekend) now, so I will let them run for a week
> >>> or so and then report back.
> >>>
> >>> In order to ease testing and make it more controlled, do you have any
> >>> suggestions for how to trigger the error? Is it "just" a timing issue
> >>> or should I be able to trigger it with for example a specific traffic
> >>> pattern?
> >>>
> >>> -Kristian
>
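
The ordering problem Mingyu describes above (free tx data, then clear the
interrupt, versus clear the interrupt, then free) can be illustrated with a
small toy model. This is only a sketch under stated assumptions: it is not
the mtk_eth_soc code and not the patch discussed in this thread, and every
name in it (TxRing, hw_complete, poll_free_then_ack, poll_ack_then_free,
stranded) is hypothetical. It assumes a single "tx done" flag that the
hardware sets on completion and the driver clears.

```python
from dataclasses import dataclass, field


@dataclass
class TxRing:
    """Toy tx ring: completed packets wait in `done` until the driver frees them."""
    done: list = field(default_factory=list)
    irq_pending: bool = False  # "tx done" flag, set by hardware, cleared by driver
    freed: int = 0

    def hw_complete(self, n=1):
        """Hardware finishes n packets and sets the tx-done flag."""
        self.done.extend(range(n))
        self.irq_pending = True

    def reap(self):
        """Driver frees every completed packet it can currently see."""
        self.freed += len(self.done)
        self.done.clear()

    def ack(self):
        """Driver clears the tx-done flag."""
        self.irq_pending = False


def poll_free_then_ack(ring, late):
    """Old order: free first, then clear the interrupt."""
    ring.reap()
    ring.hw_complete(late)  # completion lands between the free and the clear
    ring.ack()              # flag is cleared, so nothing will trigger a reap


def poll_ack_then_free(ring, late):
    """Proposed order: clear the interrupt first, then free."""
    ring.ack()
    ring.hw_complete(late)  # hardware sets the flag again
    ring.reap()             # late packet is freed; flag stays set for next poll


def stranded(ring):
    """Completed packets that no pending interrupt will cause to be freed."""
    return 0 if ring.irq_pending else len(ring.done)
```

In this model, starting each ring with three completed packets and letting
one more complete inside the window, the old order leaves that packet
completed but unfreed with no interrupt pending, which is the situation
that would eventually surface as a tx timeout. The new order frees it; the
flag that is left set costs at most one extra poll that finds nothing to do.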