nvme-pci timeout issue

Chaitanya Kulkarni chaitanyak at nvidia.com
Wed Jan 24 02:05:08 PST 2024


On 1/24/2024 1:52 AM, Christoph Hellwig wrote:
> On Tue, Jan 23, 2024 at 10:19:28PM +0000, Chaitanya Kulkarni wrote:
>> Can someone please provide an insight on this behavior so we can merge
>> testcase into blktests? Please note that Shinichiro also observed the
>> same behavior.
> 
> Can you provide the test case?
> 


Here is the latest version of the blktests patch for the testcase:

From d3ab5bb3d22fcc3593e4da7599523e013239720e Mon Sep 17 00:00:00 2001
From: Chaitanya Kulkarni <kch at nvidia.com>
Date: Tue, 23 Jan 2024 14:30:28 -0800
Subject: [PATCH blktests V3] nvme: add nvme pci timeout testcase

Trigger and test nvme-pci timeout with concurrent fio jobs.

Signed-off-by: Chaitanya Kulkarni <kch at nvidia.com>
---
V3:-

1. Add CAN_BE_ZONED.
2. Add FAULT_INJECTION_DEBUG_FS check in requires.
3. Remove _require_nvme_trtype pci in requires().
4. Remove device_requires().
5. Store fio output in FULL.
6. Remove shellcheck and use grep I/O error value to pass/fail testcase.

---
  tests/nvme/050     | 69 ++++++++++++++++++++++++++++++++++++++++++++++
  tests/nvme/050.out |  2 ++
  2 files changed, 71 insertions(+)
  create mode 100755 tests/nvme/050
  create mode 100644 tests/nvme/050.out

diff --git a/tests/nvme/050 b/tests/nvme/050
new file mode 100755
index 0000000..cacaba6
--- /dev/null
+++ b/tests/nvme/050
@@ -0,0 +1,69 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2024 Chaitanya Kulkarni
+#
+# Test NVMe-PCI timeout with FIO jobs by triggering the nvme_timeout function.
+#
+
+. tests/nvme/rc
+
+DESCRIPTION="test nvme-pci timeout with fio jobs"
+CAN_BE_ZONED=1
+
+sysfs_path="/sys/kernel/debug/fail_io_timeout/"
+#restrict test to nvme-pci only
+nvme_trtype=pci
+
+# fault injection config array
+declare -A fi_array
+
+requires() {
+	_have_fio
+	_nvme_requires
+	_have_kernel_option FAIL_IO_TIMEOUT
+	_have_kernel_option FAULT_INJECTION_DEBUG_FS
+}
+
+save_fi_settings() {
+	for fi_attr in probability interval times space verbose
+	do
+		fi_array["${fi_attr}"]=$(cat "${sysfs_path}/${fi_attr}")
+	done
+}
+
+restore_fi_settings() {
+	for fi_attr in probability interval times space verbose
+	do
+		echo "${fi_array["${fi_attr}"]}" > "${sysfs_path}/${fi_attr}"
+	done
+}
+
+test_device() {
+	local nvme_ns
+	local io_fimeout_fail
+
+	echo "Running ${TEST_NAME}"
+
+	nvme_ns="$(basename "${TEST_DEV}")"
+	io_fimeout_fail="$(cat /sys/block/"${nvme_ns}"/io-timeout-fail)"
+	save_fi_settings
+	echo 1 > /sys/block/"${nvme_ns}"/io-timeout-fail
+
+	echo 100 > /sys/kernel/debug/fail_io_timeout/probability
+	echo   1 > /sys/kernel/debug/fail_io_timeout/interval
+	echo  -1 > /sys/kernel/debug/fail_io_timeout/times
+	echo   0 > /sys/kernel/debug/fail_io_timeout/space
+	echo   1 > /sys/kernel/debug/fail_io_timeout/verbose
+
+	fio --bs=4k --rw=randread --norandommap --numjobs="$(nproc)" \
+	    --name=reads --direct=1 --filename="${TEST_DEV}" --group_reporting \
+	    --time_based --runtime=1m >& "$FULL"
+
+	if grep -q "Input/output error" "$FULL"; then
+		echo "Test complete"
+	else
+		echo "Test failed"
+	fi
+	restore_fi_settings
+	echo "${io_fimeout_fail}" > /sys/block/"${nvme_ns}"/io-timeout-fail
+}
diff --git a/tests/nvme/050.out b/tests/nvme/050.out
new file mode 100644
index 0000000..b78b05f
--- /dev/null
+++ b/tests/nvme/050.out
@@ -0,0 +1,2 @@
+Running nvme/050
+Test complete
-- 
2.40.0
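
For anyone who wants to check the save/restore logic without a kernel built
with FAIL_IO_TIMEOUT, below is a minimal sketch of the same pattern run
against a scratch directory. Note that fi_dir, fi_attrs, fi_saved and the
seeded values are assumptions made for illustration only; the patch itself
operates on /sys/kernel/debug/fail_io_timeout/ and reads whatever the running
kernel reports.

```shell
#!/bin/bash
# Sketch only: fi_dir is a hypothetical stand-in for
# /sys/kernel/debug/fail_io_timeout so this runs without root or debugfs.
fi_dir="$(mktemp -d)"
trap 'rm -rf "${fi_dir}"' EXIT

fi_attrs=(probability interval times space verbose)
declare -A fi_saved

# Seed the stand-in files with placeholder values (not read from a kernel).
echo 0 > "${fi_dir}/probability"
echo 1 > "${fi_dir}/interval"
echo 1 > "${fi_dir}/times"
echo 0 > "${fi_dir}/space"
echo 2 > "${fi_dir}/verbose"

# Remember the current value of every fault injection attribute.
save_fi_settings() {
	local attr
	for attr in "${fi_attrs[@]}"; do
		fi_saved["${attr}"]="$(cat "${fi_dir}/${attr}")"
	done
}

# Write the remembered values back.
restore_fi_settings() {
	local attr
	for attr in "${fi_attrs[@]}"; do
		echo "${fi_saved[${attr}]}" > "${fi_dir}/${attr}"
	done
}

save_fi_settings

# Same values the testcase writes before starting fio.
echo 100 > "${fi_dir}/probability"
echo 1 > "${fi_dir}/interval"
echo -1 > "${fi_dir}/times"
echo 0 > "${fi_dir}/space"
echo 1 > "${fi_dir}/verbose"
during="$(cat "${fi_dir}/probability")"

restore_fi_settings
after="$(cat "${fi_dir}/probability")"

echo "probability during test: ${during}, after restore: ${after}"
```

The point of the pattern is that the testcase leaves the fault injection
attributes as it found them, whatever values were configured beforehand.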



It is also posted on linux-nvme, just in case:

https://lists.infradead.org/pipermail/linux-nvme/2024-January/044562.html

-ck


