[LEDE-DEV] stall/hang in netifd on LEDE r1318 on Linksys WRT1900AC V1

Syrone Wong wong.syrone at gmail.com
Wed Aug 17 16:39:45 PDT 2016


Hello Josua, pat,

I haven't tested this yet. Thanks for your effort.

Are you sure this is the root cause? Everything worked well in the past
without this config option being enabled.

If so, please send a PR or post a patch to the mailing list.
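
For anyone who wants to confirm whether a given tree or build already
carries the option before flashing, here is a minimal sketch. It is not
part of the patch: the function name kconfig_symbol_state is made up for
illustration, and it assumes the config file uses the usual kernel forms
"CONFIG_FOO=value" and "# CONFIG_FOO is not set". The sample text is
copied from the hunk in the patch quoted below.

```python
def kconfig_symbol_state(config_text, symbol):
    """Return the value of a kernel config symbol.

    Returns the text after '=' (for example "y") when the symbol is set,
    "n" when the file says "# SYMBOL is not set", and None when the
    symbol does not appear at all.
    """
    set_prefix = symbol + "="
    unset_line = "# " + symbol + " is not set"
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith(set_prefix):
            return line[len(set_prefix):]
        if line == unset_line:
            return "n"
    return None


# Sample lines taken from the patch hunk for target/linux/mvebu/config-4.4.
SAMPLE = """\
CONFIG_HIGHMEM=y
# CONFIG_HIGHPTE is not set
CONFIG_HOTPLUG_CPU=y
CONFIG_HWBM=y
"""

print("CONFIG_HOTPLUG_CPU:", kconfig_symbol_state(SAMPLE, "CONFIG_HOTPLUG_CPU"))
```

To check a real tree, read target/linux/mvebu/config-4.4 into a string and
pass it in place of SAMPLE. A result of None or "n" would mean the option
is not enabled in that file.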




Best Regards,
Syrone Wong


On Thu, Aug 18, 2016 at 3:16 AM, pat <pat at patfruth.com> wrote:
> Hello Josua,
>
> To my great satisfaction, your suggestion works beautifully!
>
> Thank you very much for taking time to pass this suggestion along.
>
> Pat
>
>
>
>> On Aug 17, 2016, at 9:08 AM, Josua Mayer <josua.mayer97 at gmail.com> wrote:
>>
>> Hi Syrone, Pat,
>>
>> I ran into the same issue on the Clearfog Pro, and my colleague figured
>> out a way to make it disappear.
>> Please try out this patch and let me know if it helps on your boards:
>> https://github.com/Artox/lede-project/commit/db724f8ff1ed4c77668f691ed4d066a8e0f2693e
>>
>> From db724f8ff1ed4c77668f691ed4d066a8e0f2693e Mon Sep 17 00:00:00 2001
>> From: Josua Mayer <josua.mayer97 at gmail.com>
>> Date: Wed, 17 Aug 2016 16:42:07 +0200
>> Subject: [PATCH] mvebu: enable cpu hotplug support in kernel
>>
>> This option prevents the rcu stalls in mvneta on armada-38x.
>>
>> Signed-off-by: Josua Mayer <josua.mayer97 at gmail.com>
>> ---
>> target/linux/mvebu/config-4.4 | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/target/linux/mvebu/config-4.4 b/target/linux/mvebu/config-4.4
>> index d0f042e..6c4ff70 100644
>> --- a/target/linux/mvebu/config-4.4
>> +++ b/target/linux/mvebu/config-4.4
>> @@ -209,6 +209,7 @@ CONFIG_HAVE_UID16=y
>> CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
>> CONFIG_HIGHMEM=y
>> # CONFIG_HIGHPTE is not set
>> +CONFIG_HOTPLUG_CPU=y
>> CONFIG_HWBM=y
>> CONFIG_HWMON=y
>> CONFIG_HZ_FIXED=0
>> --
>> 2.6.6
>>
>>
>> Am 15.08.2016 um 08:57 schrieb Syrone Wong:
>>> I have the same issue. Everything works well on
>>> https://github.com/lede-project/source/commit/22ef1c83b35cd5633b0c58c9c38a43494a906a6a,
>>> but boot hangs with a build of
>>> https://github.com/lede-project/source/commit/b9b665ae49469a73d254b1a219a4a7c4e22f27c0
>>> that I compiled last night.
>>>
>>> I was too lazy to attach a TTL cable, so I reverted to the older version.
>>>
>>> I hope this information helps.
>>>
>>> Best Regards,
>>> Syrone Wong
>>>
>>>
>>> On Mon, Aug 15, 2016 at 2:27 PM, pat <pat at patfruth.com> wrote:
>>>> Dear LEDE devs,
>>>>
>>>> There doesn’t appear to be a LEDE forum yet, or I’d post this there.  So I’m hoping someone on the mailing list has a suggestion.
>>>>
>>>> I’ve been running a build of OpenWRT DD R49195 since earlier this year.
>>>> I thought I’d try to move to LEDE.
>>>>
>>>> To that end, I’ve just built an image based on LEDE r1318 for a Linksys WRT1900AC V1 (aka mamba).
>>>> The build completed successfully, and the image appears to have flashed successfully.
>>>> On boot, the process stalls/hangs.  I see the following on the serial/tty console:
>>>>
>>>> ...
>>>> [   16.641909] device eth0 entered promiscuous mode
>>>> [   16.649345] IPv6: ADDRCONF(NETDEV_UP): br-lan: link is not ready
>>>> [   76.692269] INFO: rcu_sched self-detected stall on CPU
>>>> [   76.697461]  1-...: (6000 ticks this GP) idle=4b7/140000000000001/0 softirq=902/919 fqs=5980
>>>> [   76.702321] INFO: rcu_sched detected stalls on CPUs/tasks:
>>>> [   76.702341]  1-...: (6000 ticks this GP) idle=4b7/140000000000001/0 softirq=902/919 fqs=5980
>>>> [   76.702352]  (detected by 0, t=6002 jiffies, g=111, c=110, q=990)
>>>> [   76.702356] Task dump for CPU 1:
>>>> [   76.702368] netifd          R running      0  1106      1 0x00000002
>>>> [   76.702401] [<c00101ec>] (__schedule) from [<c00da48c>] (SyS_ioctl+0x34/0x5c)
>>>> [   76.702418] [<c00da48c>] (SyS_ioctl) from [<c0009c80>] (ret_fast_syscall+0x0/0x3c)
>>>> [   76.750523]   (t=6006 jiffies g=111 c=110 q=990)
>>>> [   76.755178] Task dump for CPU 1:
>>>> [   76.758420] netifd          R running      0  1106      1 0x00000002
>>>> [   76.764838] [<c001fa3c>] (unwind_backtrace) from [<c001c3a4>] (show_stack+0x10/0x14)
>>>> [   76.772617] [<c001c3a4>] (show_stack) from [<c006b3b8>] (rcu_dump_cpu_stacks+0x78/0xb0)
>>>> [   76.780648] [<c006b3b8>] (rcu_dump_cpu_stacks) from [<c006e7c0>] (rcu_check_callbacks+0x28c/0x754)
>>>> [   76.789637] [<c006e7c0>] (rcu_check_callbacks) from [<c00708dc>] (update_process_times+0x38/0x64)
>>>> [   76.798543] [<c00708dc>] (update_process_times) from [<c007f738>] (tick_sched_timer+0x21c/0x260)
>>>> [   76.807358] [<c007f738>] (tick_sched_timer) from [<c0071694>] (__hrtimer_run_queues+0xf8/0x1b8)
>>>> [   76.816084] [<c0071694>] (__hrtimer_run_queues) from [<c00718ac>] (hrtimer_interrupt+0xac/0x200)
>>>> [   76.824898] [<c00718ac>] (hrtimer_interrupt) from [<c02edac0>] (armada_370_xp_timer_interrupt+0x30/0x38)
>>>> [   76.834407] [<c02edac0>] (armada_370_xp_timer_interrupt) from [<c00664f0>] (handle_percpu_devid_irq+0x6c/0x84)
>>>> [   76.844447] [<c00664f0>] (handle_percpu_devid_irq) from [<c00623c0>] (generic_handle_irq+0x24/0x34)
>>>> [   76.853521] [<c00623c0>] (generic_handle_irq) from [<c0062698>] (__handle_domain_irq+0x98/0xac)
>>>> [   76.862247] [<c0062698>] (__handle_domain_irq) from [<c0009428>] (armada_370_xp_handle_irq+0x50/0xb0)
>>>> [   76.871496] [<c0009428>] (armada_370_xp_handle_irq) from [<c000a5f4>] (__irq_svc+0x54/0x70)
>>>> [   76.879869] Exception stack(0xce1b5de8 to 0xce1b5e30)
>>>> [   76.884938] 5de0:                   00000000 cf83fca4 00000000 cf83eca4 c05c6614 cf83fca4
>>>> [   76.893141] 5e00: cf83fc80 cf83f830 00000000 00000000 00000000 00000000 cf83eca8 ce1b5e38
>>>> [   76.901341] 5e20: c0028e20 c0042198 a0000013 ffffffff
>>>> [   76.906417] [<c000a5f4>] (__irq_svc) from [<c0042198>] (raw_notifier_chain_register+0x10/0x40)
>>>> [   76.915064] [<c0042198>] (raw_notifier_chain_register) from [<c0028e20>] (register_cpu_notifier+0x28/0x3c)
>>>> [   76.924759] [<c0028e20>] (register_cpu_notifier) from [<c028d29c>] (mvneta_open+0xb8/0x170)
>>>> [   76.933147] [<c028d29c>] (mvneta_open) from [<c03164a0>] (__dev_open+0x8c/0x108)
>>>> [   76.940569] [<c03164a0>] (__dev_open) from [<c0316754>] (__dev_change_flags+0xb0/0x140)
>>>> [   76.948597] [<c0316754>] (__dev_change_flags) from [<c03167fc>] (dev_change_flags+0x18/0x48)
>>>> [   76.957072] [<c03167fc>] (dev_change_flags) from [<c032b4a8>] (dev_ifsioc+0xd0/0x320)
>>>> [   76.964928] [<c032b4a8>] (dev_ifsioc) from [<c032bf60>] (dev_ioctl+0x7f4/0x8c0)
>>>> [   76.972266] [<c032bf60>] (dev_ioctl) from [<c00da408>] (do_vfs_ioctl+0x6a4/0x6f4)
>>>> [   76.979772] [<c00da408>] (do_vfs_ioctl) from [<c00da48c>] (SyS_ioctl+0x34/0x5c)
>>>> [   76.987106] [<c00da48c>] (SyS_ioctl) from [<c0009c80>] (ret_fast_syscall+0x0/0x3c)
>>>>
>>>> It seems netifd is hung (maybe in the mvneta driver)?
>>>> Has anyone else already seen this?
>>>> What is causing this?
>>>> How do I fix it?
>>>>
>>>> Thanks
>>>>
>>>>
>>>> _______________________________________________
>>>> Lede-dev mailing list
>>>> Lede-dev at lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/lede-dev
>>>
>


