rcu_preempt detected stalls

Jorge Ramirez-Ortiz, Foundries jorge at foundries.io
Wed Sep 1 01:23:21 PDT 2021


On 01/09/21, Zhouyi Zhou wrote:
> Hi,
> 
> I performed the following two new rounds of experiments:
> 
> 
> Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp
> 8 -hda ./debian10.qcow2 -m 4096 -net
> user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000
> -vnc :30)
> 
> 1.   CONFIG_RCU_BOOST=y
> 1.1 as root, run #stress-ng --sequential 100  --class scheduler -t 5m --times
> 1.2 as regular user at the same time, run $stress-ng --sequential 100
> --class scheduler -t 5m --times
> 
> The system began OOM killing after 6 minutes:
> Aug 31 19:41:12 debian kernel: [  847.171884] task:kworker/1:0     state:D stack:    0 pid: 1634 ppid:     2 flags:0x00004000
> Aug 31 19:41:12 debian kernel: [  847.171890] Workqueue: ipv6_addrconf addrconf_verify_work
> Aug 31 19:41:12 debian kernel: [  847.171897] Call Trace:
> Aug 31 19:41:12 debian kernel: [  847.171903]  __schedule+0x368/0xa40
> Aug 31 19:41:12 debian kernel: [  847.171915]  schedule+0x44/0xe0
> Aug 31 19:41:12 debian kernel: [  847.171921]  schedule_preempt_disabled+0x14/0x20
> Aug 31 19:41:12 debian kernel: [  847.171924]  __mutex_lock+0x4b1/0xa10
> Aug 31 19:41:12 debian kernel: [  847.171935]  ? addrconf_verify_work+0xa/0x20
> Aug 31 19:41:12 debian kernel: [  847.171948]  ? addrconf_verify_work+0xa/0x20
> Aug 31 19:41:12 debian kernel: [  847.171951]  addrconf_verify_work+0xa/0x20
> Aug 31 19:41:12 debian kernel: [  847.171955]  process_one_work+0x1fa/0x5b0
> Aug 31 19:41:12 debian kernel: [  847.171967]  worker_thread+0x64/0x3d0
> Aug 31 19:41:12 debian kernel: [  847.171974]  ? process_one_work+0x5b0/0x5b0
> Aug 31 19:41:12 debian kernel: [  847.171978]  kthread+0x131/0x180
> Aug 31 19:41:12 debian kernel: [  847.171982]  ? set_kthread_struct+0x40/0x40
> Aug 31 19:41:12 debian kernel: [  847.171989]  ret_from_fork+0x1f/0x30
> Aug 31 19:41:12 debian kernel: [  847.176007]
> Aug 31 19:41:12 debian kernel: [  847.176007] Showing all locks held in the system:
> Aug 31 19:41:12 debian kernel: [  847.176016] 1 lock held by khungtaskd/56:
> Aug 31 19:41:12 debian kernel: [  847.176018]  #0: ffffffff82918b60 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0xe/0x1a0
> 
> 2.  # CONFIG_RCU_BOOST is not set
> 2.1 as root, run #stress-ng --sequential 100  --class scheduler -t 5m --times
> 2.2 as regular user at the same time, run $stress-ng --sequential 100
> --class scheduler -t 5m --times
> The system began OOM killing after 6 minutes:
> The system was so dead that I could not save the backtrace to a file, nor
> did the kernel have a chance to save the log to /var/log/messages.
> 

All,

Thanks for testing on x86. We can also reproduce this on qemu arm64, so I
think it points to the stress-ng test itself. I will debug it early next
week and report back then. I didn't expect so much support so fast, TBH;
it took me by surprise. Thanks again.
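
For anyone who wants to repeat the test quoted above, here is a minimal
sketch of starting the two stress-ng runs (root and regular user) at the
same time. The stress-ng arguments are the ones from the report; the
TEST_USER and RUN variables and the function names are hypothetical,
added only for illustration. It assumes stress-ng is installed and that
the script is invoked as root.

```shell
#!/bin/sh
# Sketch only: TEST_USER, RUN, stress_cmd and run_both are illustrative
# names, not something taken from the thread.

# Unprivileged account for the second run (assumption: override as needed).
TEST_USER="${TEST_USER:-nobody}"

# Print the stress-ng command line used in the report.
stress_cmd() {
    echo "stress-ng --sequential 100 --class scheduler -t 5m --times"
}

# Start both runs concurrently, then wait for both to finish.
run_both() {
    # Run as root (the script itself is expected to run as root).
    $(stress_cmd) &
    # Run the same command as the regular user.
    su -s /bin/sh -c "$(stress_cmd)" "$TEST_USER" &
    wait
}

# Launch only when explicitly requested, e.g.: RUN=1 sh repro.sh
if [ "${RUN:-0}" = "1" ]; then
    run_both
fi
```

Running it with RUN=1 inside a disposable VM is advisable, since the
report says the guest can become unresponsive after a few minutes.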

More information about the linux-arm-kernel mailing list