rcu_preempt detected stalls
Jorge Ramirez-Ortiz, Foundries
jorge at foundries.io
Wed Sep 1 01:23:21 PDT 2021
On 01/09/21, Zhouyi Zhou wrote:
> Hi,
>
> I performed the following two new rounds of experiments:
>
>
> Test environment (x86_64 debian10 virtual machine: kvm -cpu host -smp
> 8 -hda ./debian10.qcow2 -m 4096 -net
> user,hostfwd=tcp::5556-:22,hostfwd=tcp::5555-:19 -net nic,model=e1000
> -vnc :30)
>
> 1. CONFIG_RCU_BOOST=y
> 1.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times
> 1.2 as regular user at the same time, run $stress-ng --sequential 100
> --class scheduler -t 5m --times
>
> The system begins OOM killing after 6 minutes:
> Aug 31 19:41:12 debian kernel: [ 847.171884] task:kworker/1:0 state:D
> stack: 0 pid: 1634 ppid: 2 flags:0x00004000
> Aug 31 19:41:12 debian kernel: [ 847.171890] Workqueue: ipv6_addrconf
> addrconf_verify_work
> Aug 31 19:41:12 debian kernel: [ 847.171897] Call Trace:
> Aug 31 19:41:12 debian kernel: [ 847.171903] __schedule+0x368/0xa40
> Aug 31 19:41:12 debian kernel: [ 847.171915] schedule+0x44/0xe0
> Aug 31 19:41:12 debian kernel: [ 847.171921]
> schedule_preempt_disabled+0x14/0x20
> Aug 31 19:41:12 debian kernel: [ 847.171924] __mutex_lock+0x4b1/0xa10
> Aug 31 19:41:12 debian kernel: [ 847.171935] ? addrconf_verify_work+0xa/0x20
> Aug 31 19:41:12 debian kernel: [ 847.171948] ? addrconf_verify_work+0xa/0x20
> Aug 31 19:41:12 debian kernel: [ 847.171951] addrconf_verify_work+0xa/0x20
> Aug 31 19:41:12 debian kernel: [ 847.171955] process_one_work+0x1fa/0x5b0
> Aug 31 19:41:12 debian kernel: [ 847.171967] worker_thread+0x64/0x3d0
> Aug 31 19:41:12 debian kernel: [ 847.171974] ? process_one_work+0x5b0/0x5b0
> Aug 31 19:41:12 debian kernel: [ 847.171978] kthread+0x131/0x180
> Aug 31 19:41:12 debian kernel: [ 847.171982] ? set_kthread_struct+0x40/0x40
> Aug 31 19:41:12 debian kernel: [ 847.171989] ret_from_fork+0x1f/0x30
> Aug 31 19:41:12 debian kernel: [ 847.176007]
> Aug 31 19:41:12 debian kernel: [ 847.176007] Showing all locks held
> in the system:
> Aug 31 19:41:12 debian kernel: [ 847.176016] 1 lock held by khungtaskd/56:
> Aug 31 19:41:12 debian kernel: [ 847.176018] #0: ffffffff82918b60
> (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0xe/0x1a0
>
> 2. # CONFIG_RCU_BOOST is not set
> 2.1 as root, run #stress-ng --sequential 100 --class scheduler -t 5m --times
> 2.2 as regular user at the same time, run $stress-ng --sequential 100
> --class scheduler -t 5m --times
> The system begins OOM killing after 6 minutes:
> The system is so dead that I can't save the backtrace to a file, nor did
> the kernel have a chance to save the log to /var/log/messages.
>
All,
Thanks for testing on x86. We can also reproduce this on qemu arm64, so I
think it points to the stress-ng test itself. I will debug it early next
week and report back then. I didn't expect so much support so fast, TBH; it
took me by surprise (thanks again).
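
For anyone who wants to repeat the quoted x86 runs, here is a minimal sketch of steps 1.1 and 1.2 as a single script. The stress-ng arguments are copied from the report. Everything else is an assumption for illustration: the script is started as root, TEST_USER is a hypothetical unprivileged account, runuser from util-linux is available, and the DRY_RUN switch is only there so the commands can be inspected before they are launched.

```shell
#!/bin/sh
# Sketch only: start the two stress-ng instances from the report together.
# Assumptions (not in the report): run as root; TEST_USER is a hypothetical
# existing unprivileged account; runuser (util-linux) is installed.
# By default the commands are only printed; set DRY_RUN=0 to execute them.

TEST_USER="${TEST_USER:-testuser}"
DRY_RUN="${DRY_RUN:-1}"

# Arguments taken verbatim from the report.
STRESS_ARGS="--sequential 100 --class scheduler -t 5m --times"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@" &    # background, so both instances overlap in time
    fi
}

# Step 1.1: instance started as root.
run stress-ng $STRESS_ARGS

# Step 1.2: instance started as a regular user at the same time.
run runuser -u "$TEST_USER" -- stress-ng $STRESS_ARGS

# Wait for both background jobs (returns immediately in dry-run mode).
wait
```

Running it with DRY_RUN=0 and TEST_USER set to a real account launches both instances; the report does not say how the two runs were synchronised, so the backgrounding here is just one way to make them overlap.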