Resets during user commands lead to hung task and controller stuck in connecting
Jonathan Derrick
jonathan.derrick at linux.dev
Fri Nov 11 13:50:33 PST 2022
Hi,
I'm (again) seeing a hung task when doing resets and formats simultaneously.
The controller state is left in 'connecting'.
I'm using nvme.git/nvme-6.2 as of 'nvme: implement the DEAC bit for the Write Zeroes command',
but I have also reproduced it with Christoph's latest reset/probe-split set.
ctrl="nvme0"
nsid=1
pci="/sys/block/${ctrl}n${nsid}/device/"
echo 30 > /proc/sys/kernel/hung_task_timeout_secs
while true; do
    nvme format -f "/dev/${ctrl}n${nsid}" &
    echo 1 > "$pci/reset_controller" &
done
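Once the hang shows up, the stuck state can be confirmed by reading the controller's state attribute from sysfs. This is a minimal sketch, assuming the controller is exposed at /sys/class/nvme/nvme0; the ctrl_state helper and the NVME_SYSFS_DIR override are hypothetical names used only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the contents of the "state" attribute found in
# the given directory, or "unknown" if it cannot be read.
ctrl_state() {
    # $1: directory holding a "state" file, e.g. /sys/class/nvme/nvme0
    cat "$1/state" 2>/dev/null || echo "unknown"
}

# Assumption: same controller name as in the reproducer above.
ctrl="${ctrl:-nvme0}"
# Assumption: NVME_SYSFS_DIR is an illustrative override, not a kernel or nvme-cli setting.
sysfs_dir="${NVME_SYSFS_DIR:-/sys/class/nvme/$ctrl}"

echo "$ctrl state: $(ctrl_state "$sysfs_dir")"
```

Run after the hung task message appears; on the affected system this would be expected to report 'connecting' rather than 'live'.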
[ 79.195862] nvme nvme0: Ignoring bogus Namespace Identifiers
[ 122.378580] INFO: task sh:7737 blocked for more than 30 seconds.
[ 122.380329] Not tainted 6.1.0-rc2+ #87
[ 122.381594] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 122.383782] task:sh state:D stack:0 pid:7737 ppid:1 flags:0x00000004
[ 122.386078] Call Trace:
[ 122.386909] <TASK>
[ 122.387659] __schedule+0x320/0xb10
[ 122.388772] ? lock_release+0x22b/0x450
[ 122.389920] ? lock_acquired+0x1a2/0x400
[ 122.391094] ? wait_for_completion+0x83/0x160
[ 122.392358] schedule+0x53/0xd0
[ 122.393337] schedule_timeout+0x310/0x3b0
[ 122.394517] ? rcu_read_lock_held_common+0xe/0x50
[ 122.395847] ? rcu_read_lock_sched_held+0x23/0x80
[ 122.397189] ? lock_release+0x22b/0x450
[ 122.398336] ? lock_acquired+0x1a2/0x400
[ 122.399484] ? wait_for_completion+0x83/0x160
[ 122.400763] wait_for_completion+0xb5/0x160
[ 122.401968] __flush_work+0x293/0x4a0
[ 122.403068] ? flush_workqueue_prep_pwqs+0x120/0x120
[ 122.404463] ? rcu_read_lock_sched_held+0x23/0x80
[ 122.405791] ? trace_hardirqs_on+0x2b/0xd0
[ 122.406977] nvme_reset_ctrl_sync+0x2a/0x40 [nvme_core]
[ 122.408473] nvme_sysfs_reset+0x12/0x30 [nvme_core]
[ 122.409856] kernfs_fop_write_iter+0x142/0x1e0
[ 122.411137] vfs_write+0x357/0x4f0
[ 122.412161] ksys_write+0x5f/0xe0
[ 122.413172] do_syscall_64+0x3a/0x90
[ 122.414238] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 122.415646] RIP: 0033:0x7f7980ced648
[ 122.416710] RSP: 002b:00007ffc01ee6778 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 122.418794] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f7980ced648
[ 122.420673] RDX: 0000000000000002 RSI: 0000563b40d93ca0 RDI: 0000000000000001
[ 122.422550] RBP: 0000563b40d93ca0 R08: 000000000000000a R09: 00007f7980d3cda0
[ 122.424385] R10: 000000000000000a R11: 0000000000000246 R12: 00007f7980fc06e0
[ 122.426232] R13: 0000000000000002 R14: 00007f7980fbb880 R15: 0000000000000002
[ 122.428067] </TASK>
[ 122.428821] INFO: lockdep is turned off.