[PATCH v2] tests/nvme: Add admin-passthru+reset race test

Keith Busch kbusch at kernel.org
Mon Nov 21 12:55:12 PST 2022


On Thu, Nov 17, 2022 at 02:22:10PM -0700, Jonathan Derrick wrote:
> I seem to have isolated the error mechanism for older kernels, but 6.2.0-rc2
> reliably segfaults my QEMU instance (something else to look into) and I don't
> have any 'real' hardware to test this on at the moment. It looks like several
> passthru commands are able to enqueue prior/during/after resetting/connecting.

I'm not seeing any problem with the latest nvme-qemu after several dozen
iterations of this test case. In that environment, the formats and
resets complete practically synchronously with the call, so everything
proceeds quickly. Is there anything special I need to change?
 
> The issue seems to be very heavily timing related, so the loop in the header is
> a lot more forceful in this approach.
> 
> As far as the loop goes, I've noticed it will typically repro immediately or
> pass the whole test.

I can only get a possible repro in scenarios that have multi-second,
serialized format times. Even then, it still appears that everything
fixes itself after waiting. Are you observing the same, or is it stuck
forever in your observations?
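
One way to tell "recovers after waiting" apart from "stuck forever" might
be to poll the controller state with a timeout once the loop finishes.
The following is only a rough sketch: it assumes the controller state is
readable from a sysfs file such as /sys/class/nvme/<ctrl>/state and reads
"live" once the controller has recovered. The helper name wait_for_live
and the default timeout are made up for illustration and are not part of
the patch.

```shell
# Hypothetical helper (not part of the patch): poll a state file until it
# reads "live" or the timeout in seconds expires.
# Returns 0 if the controller came back, 1 if it still was not live.
wait_for_live() {
	local state_file=$1
	local timeout=${2:-30}
	local elapsed=0

	while [ "$elapsed" -lt "$timeout" ]; do
		# A missing or unreadable file is treated as "not live yet".
		if [ "$(cat "$state_file" 2>/dev/null)" = "live" ]; then
			return 0
		fi
		sleep 1
		elapsed=$((elapsed + 1))
	done
	return 1
}
```

Calling it as, for example, wait_for_live /sys/class/nvme/nvme0/state 60
after the format/reset loop (nvme0 being a placeholder) would separate a
slow recovery from a controller that never returns to live.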

> +remove_and_rescan() {
> +	local pdev=$1
> +	echo 1 > /sys/bus/pci/devices/"$pdev"/remove
> +	echo 1 > /sys/bus/pci/rescan
> +}

This function isn't called anywhere.


