[LEDE-DEV] mt7621 cpu stalls - need help testing

Jaap Buurman jaapbuurman at gmail.com
Sat Jul 1 02:29:34 PDT 2017


Dear John,

I haven't had time to test it myself yet, but unfortunately there have
already been two reports of devices crashing with SQM cake enabled in
the previously mentioned topic by Bart van Zoest.

One of the posters provided a stack trace, which hopefully will be
useful in debugging the issue:


Applied blogic's patches to both the current snapshot and 17.01, and
tested with both fq_codel and cake.
After fiddling with the limits for a bit it looked like the bug was
fixed, but sadly not. About 300 Mbit/s ingress/egress does the trick:
after ~15-20 minutes and some heavy downloading, reboots and crashes
happen.

[ 945.720000] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 945.730000] 1-...: (21 GPs behind) idle=a22/0/0 softirq=25119/25120 fqs=1
[ 945.740000] (detected by 3, t=6004 jiffies, g=3848, c=3847, q=313)
[ 945.750000] Task dump for CPU 1:
[ 945.760000] swapper/1 R running 0 0 1 0x00100000
[ 945.770000] Stack : 00000000 87c4b180 000000dc ffffffff 000000c2 00000000 804db2a4 80490000
[ 945.770000] 8048874c 00000001 00000001 80488540 80488724 80490000 80490000 8000c0e0
[ 945.770000] 1100fc03 00000001 87c70000 87c71ec0 80490000 8000c410 1100fc03 00000001
[ 945.770000] 804db2a4 80490000 804db2a4 8005ed68 80490000 8001b2f8 1100fc03 00000000
[ 945.770000] 00000004 804884a0 000000a0 8001b300 c939c939 c939c939 c939c939 c939c939
[ 945.770000] ...
[ 945.840000] Call Trace:
[ 945.850000] [<8000be98>] __schedule+0x574/0x758
[ 945.860000] [<8000c0e0>] schedule+0x64/0x7c
[ 945.870000] [<8000c410>] schedule_preempt_disabled+0x10/0x1c
[ 945.880000] [<8005ed68>] cpu_startup_entry+0x11c/0x1b8
[ 945.890000] [<8001b300>] start_secondary+0x440/0x470
[ 945.900000]
[ 945.900000] rcu_sched kthread starved for 6019 jiffies! g3848 c3847 f0x0 s3 ->state=0x1


Hopefully this information will be helpful! I believe we are on the
right track, since at 300 Mbit/s it would normally crash within seconds
for me. For this poster it took 15-20 minutes, which is already a
substantial improvement.
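
For anyone collecting similar reports, here is a minimal Python sketch
for pulling the key fields out of a stall report like the one above, so
that traces from different testers can be compared. The function names
(parse_stall, parse_frames) and the hz parameter are hypothetical, made
up for illustration. The seconds value is only meaningful if hz matches
the kernel's CONFIG_HZ; the default of 100 is an assumption, not
something the log confirms.

```python
import re

# Matches the "(detected by N, t=... jiffies, g=..., c=..., q=...)" line.
STALL_RE = re.compile(
    r"\(detected by (?P<detector>\d+), t=(?P<jiffies>\d+) jiffies, "
    r"g=(?P<g>\d+), c=(?P<c>\d+), q=(?P<q>\d+)\)"
)

# Matches call trace entries such as "[<8000be98>] __schedule+0x574/0x758".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>[\w.]+)"
    r"\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_stall(line, hz=100):
    """Return the stall fields as ints, plus the stall time in seconds.

    hz is an ASSUMPTION about the kernel's CONFIG_HZ; adjust as needed.
    Returns None if the line is not a stall summary line.
    """
    m = STALL_RE.search(line)
    if m is None:
        return None
    info = {key: int(value) for key, value in m.groupdict().items()}
    info["seconds"] = info["jiffies"] / hz
    return info


def parse_frames(lines):
    """Return the symbol names found in call trace lines, in order."""
    symbols = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m is not None:
            symbols.append(m.group("sym"))
    return symbols


if __name__ == "__main__":
    stall = parse_stall(
        "[ 945.740000] (detected by 3, t=6004 jiffies, g=3848, c=3847, q=313)"
    )
    print(stall)
    print(parse_frames([
        "[ 945.850000] [<8000be98>] __schedule+0x574/0x758",
        "[ 945.880000] [<8005ed68>] cpu_startup_entry+0x11c/0x1b8",
    ]))
```

In the trace above the frames end in cpu_startup_entry and the task is
swapper/1, so the dump shows CPU 1 in its idle task at the time the
stall was reported.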

Yours sincerely,

Jaap

On Fri, Jun 30, 2017 at 1:47 PM, bart van zoest <bartvanzoest at gmail.com> wrote:
> Hi John and the rest,
>
> This is great news! Hopefully this will solve the problems for
> people who use SQM QoS on mt7621 devices!
> I have compiled a build with your revised patch for the D-Link
> DIR-860L B1 located at
> https://forum.lede-project.org/t/optimized-build-for-the-d-link-dir-860l/948
> for the people wanting to test it.
> Hopefully, there will be testers and feedback soon!
>
> Regards,
> Bart
>
> On Thu, Jun 29, 2017 at 8:56 PM, John Crispin <john at phrozen.org> wrote:
>>
>>
>> On 29/06/17 20:14, Jaap Buurman wrote:
>>>
>>> Dear John,
>>>
>>> This patch sounds very promising! I will compile and test it first
>>> thing this weekend. Thank you so much for having a look at
>>> this issue :)
>>>
>>> Yours sincerely,
>>>
>>> Jaap
>>>
>>> _______________________________________________
>>> Lede-dev mailing list
>>> Lede-dev at lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/lede-dev
>>
>> I pushed a bad patch; please use this one instead:
>>
>> https://git.lede-project.org/?p=lede/blogic/staging.git;a=commit;h=c05efda56aecea0f0f52a000a3ce271775b5fb24
>>
>> i'll provide a v4.9 version tomorrow
>>
>>     John
>>
>>