[PATCH 1/2] NVMe: Make surprise removal work again

Keith Busch keith.busch at intel.com
Mon Feb 1 17:17:52 PST 2016


On Mon, Feb 01, 2016 at 07:27:23AM -0800, Busch, Keith wrote:
> the direction I was given was to move the request ending to the block layer
> when we kill it, so this won't be necessary in the next revision (will be
> sent out today).

After merging and moving IO ending to the block layer, something appears
broken. I'm not sure what's going on, so I'm just posting here in case
anyone has better ideas.

The test runs buffered writes to an nvme drive (ex: dd if=/dev/zero
of=/dev/nvme0n1 bs=16M), then yanks the drive once the writes ramp up.
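
For anyone who wants to reproduce this, the test can be sketched roughly as
below. This is only an illustration: the helper names start_writer and
dump_stack are made up for this sketch, the drive still has to be pulled by
hand, and reading /proc/<pid>/stack assumes root and a kernel built with
CONFIG_STACKTRACE.

```shell
#!/bin/sh
# Hypothetical helpers for reproducing the hang; not part of any patch.

# start_writer TARGET
# Start the buffered writer from the test in the background and record
# its PID in WRITER_PID. WARNING: this overwrites TARGET.
start_writer() {
    dd if=/dev/zero of="$1" bs=16M >/dev/null 2>&1 &
    WRITER_PID=$!
}

# dump_stack PID
# Print the kernel stack of a task (assumes root and CONFIG_STACKTRACE).
dump_stack() {
    cat "/proc/$1/stack"
}

# Usage (as root; destroys data on the target device):
#   start_writer /dev/nvme0n1
#   ... physically remove the drive once the writes ramp up ...
#   dump_stack "$WRITER_PID"
```

Run the helpers from the same shell so that WRITER_PID is still set when
dump_stack is called after the removal.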

Device removal completes after a few seconds, and /dev/nvme0n1 is no
longer present. However, the 'dd' task never completes, with kernel
stack trace:

[<ffffffff81093fb0>] __mod_timer+0xd4/0xe6
[<ffffffff810939bf>] process_timeout+0x0/0xc
[<ffffffff810fcad7>] balance_dirty_pages_ratelimited+0x8b1/0xa05
[<ffffffff8116a2a3>] __set_page_dirty.constprop.61+0x81/0x9f
[<ffffffff810f3457>] generic_perform_write+0x15a/0x1d1
[<ffffffff81157095>] generic_update_time+0x9f/0xaa
[<ffffffff810f4965>] __generic_file_write_iter+0xea/0x146
[<ffffffff8116d2a3>] blkdev_write_iter+0x78/0xf5
[<ffffffff811429b4>] __vfs_write+0x83/0xab
[<ffffffff81143cda>] vfs_write+0x87/0xdd
[<ffffffff81143ed7>] SyS_write+0x56/0x8a
[<ffffffff81472657>] entry_SYSCALL_64_fastpath+0x12/0x6a
[<ffffffffffffffff>] 0xffffffffffffffff

If the driver ends all IOs it knows about and blk_cleanup_queue returns,
then the driver did its part as far as I know. Not sure how to get this
to end with the expected IO error, but I'm pretty sure this used to work.


