[PATCH V6 0/1] nvme: allow passthru cmd error logging

Chaitanya Kulkarni chaitanyak at nvidia.com
Mon Jul 31 20:29:24 PDT 2023


On 6/28/23 18:17, Chaitanya Kulkarni wrote:
> Hi,
>
> In nvme_end_req() we only log errors for non-passthru commands. Add a
> helper function, nvme_log_err_passthru(), that allows us to log errors
> for passthru commands by decoding the cdw10-cdw15 values of the NVMe
> command.
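
For readers without the patch at hand, here is a minimal user-space sketch of
how a line in the format shown in the test log could be assembled. It is not
the kernel helper itself: struct passthru_err and format_passthru_err() are
hypothetical names, and the status bit layout (DNR in bit 14, MORE in bit 13,
SCT in bits 10:8, SC in bits 7:0) is an assumption taken from the values
printed in the log.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for the request fields being logged. */
struct passthru_err {
	const char *name;       /* disk or controller name, e.g. "nvme0" */
	const char *opcode_str; /* human-readable opcode name */
	uint8_t opcode;
	const char *status_str; /* human-readable status name */
	uint16_t status;        /* assumed: DNR bit 14, MORE bit 13, SCT 10:8, SC 7:0 */
	uint32_t cdw[6];        /* cdw10 .. cdw15 */
};

/* Format one error line into buf; returns the snprintf() result. */
int format_passthru_err(char *buf, size_t len, const struct passthru_err *e)
{
	return snprintf(buf, len,
		"%s: %s(0x%x), %s (sct 0x%x / sc 0x%x) %s%s"
		"cdw10=0x%x cdw11=0x%x cdw12=0x%x cdw13=0x%x cdw14=0x%x cdw15=0x%x",
		e->name, e->opcode_str, (unsigned)e->opcode, e->status_str,
		(unsigned)((e->status >> 8) & 0x7),  /* status code type */
		(unsigned)(e->status & 0xff),        /* status code */
		(e->status & 0x2000) ? "MORE " : "",
		(e->status & 0x4000) ? "DNR " : "",
		(unsigned)e->cdw[0], (unsigned)e->cdw[1], (unsigned)e->cdw[2],
		(unsigned)e->cdw[3], (unsigned)e->cdw[4], (unsigned)e->cdw[5]);
}
```

Feeding it the status 0x4002 that nvme-cli reports for the failed telemetry-log
command yields sct 0x0, sc 0x2 and the DNR flag.
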
>
> Below is a short test log :-
>
> * Admin Passthru error log off, no errors are printed
> * Admin Passthru error log on, errors are printed
> * IO Passthru error log off, errors are printed
> * IO Passthru error log on, errors are printed
>
> -ck

Sagi, Christoph, Keith,

Can you please take a look at this?

-ck

> Chaitanya Kulkarni (1):
>    nvme: allow passthru cmd error logging
>
> v6:
> - Rebase, retest on nvme-6.5, and add test log for admin and I/O
>    passthru error log.
>
> v5:
> - Trim down code in nvme_log_error_passthrough().
>    Use the following to get the disk name as an arg to
>    pr_err_ratelimited() :-
> 	ns ? ns->disk->disk_name : dev_name(nr->ctrl->device),
>    Use the following to get the admin vs I/O opcode string as an arg to
>    pr_err_ratelimited() :-
> 	ns ? nvme_get_opcode_str(nr->cmd->common.opcode) :
> 	     nvme_get_admin_opcode_str(nr->cmd->common.opcode),
> - Rename nvme_log_error_passthrough() -> nvme_log_err_passthru().
> - Remove else and return directly in nvme_passthru_err_log_show().
> - Generate error on invalid values of the passthru_enable variable
>    in nvme_passthru_log_store().
> - Rename passthrough -> passthru.
> - Rename sysfs attr from passthru_admin_err_logging -> passthru_log_err.
>
> v4:
> - Change sysfs attribute to passthru_admin_err_logging
> - Only log passthrough admin commands.  IO passthrough commands will
>    always be logged.
>
> v3:
> - Log a passthrough specific message that dumps CDW* contents.
> - Enable/disable via sysfs rather than debugfs.
>
> v2:
> - Included Pankaj Raghav's patch 'nvme: ignore starting sector while
>    error logging for passthrough requests'
>    with a couple changes.
> - Moved error_logging flag to nvme_ctrl structure
> - The entire nvme-debugfs.c does not need to be guarded by
>    #ifdef CONFIG_FAULT_INJECTION_DEBUG_FS.
> - Use IS_ENABLED((CONFIG_NVME_ERROR_LOGGING_DEBUG_FS)) to determine if
>    error logging should be initialized.
> - Various other nits.
>
>   drivers/nvme/host/core.c  | 43 ++++++++++++++++++++++++++++++++++-----
>   drivers/nvme/host/nvme.h  |  1 +
>   drivers/nvme/host/sysfs.c | 39 +++++++++++++++++++++++++++++++++++
>   3 files changed, 78 insertions(+), 5 deletions(-)
>
> * Admin Passthru error log off, no errors are printed :-
>
> nvme (nvme-6.5) #
> nvme (nvme-6.5) # echo 0 > /sys/class/nvme/nvme0/passthru_err_log
> nvme (nvme-6.5) # nvme telemetry-log -o /tmp/test /dev/nvme0
> NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
> Failed to acquire telemetry log 16386!
> nvme (nvme-6.5) # cd -
> /sys/kernel/debug/tracing/events/nvme
> nvme # cd -
> /mnt/data/nvme
> nvme (nvme-6.5) # dmesg  -c
>
> * Admin Passthru error log on, errors are printed :-
>
> nvme (nvme-6.5) # echo 1 > /sys/class/nvme/nvme0/passthru_err_log
> nvme (nvme-6.5) # nvme telemetry-log -o /tmp/test /dev/nvme0
> NVMe status: Invalid Field in Command: A reserved coded value or an unsupported value in a defined field(0x4002)
> Failed to acquire telemetry log 16386!
> nvme (nvme-6.5) # dmesg  -c
> [  860.013105] nvme0: Get Log Page(0x2), Invalid Field in Command (sct 0x0 / sc 0x2) DNR cdw10=0x7f0107 cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0
> nvme (nvme-6.5) #
> nvme (nvme-6.5) #
>
> * IO Passthru error log off, errors are printed :-
>
> nvme (nvme-6.5) #  echo 0 > /sys/class/nvme/nvme0/passthru_err_log
> nvme (nvme-6.5) # nvme write-zeroes -n 1 -s 0x200000 -c 10 /dev/nvme0
> NVMe status: LBA Out of Range: The command references an LBA that exceeds the size of the namespace(0x4080)
> nvme (nvme-6.5) # dmesg -c
> [73675.769162] nvme nvme0: using deprecated NVME_IOCTL_IO_CMD ioctl on the char device!
> [73675.769233] nvme0n1: Write Zeroes(0x8), LBA Out of Range (sct 0x0 / sc 0x80) DNR cdw10=0x200000 cdw11=0x0 cdw12=0xa cdw13=0x0 cdw14=0x0 cdw15=0x0
>
> * IO Passthru error log on, errors are printed :-
>
> nvme (nvme-6.5) #  echo 1 > /sys/class/nvme/nvme0/passthru_err_log
> nvme (nvme-6.5) # nvme write-zeroes -n 1 -s 0x200000 -c 10 /dev/nvme0
> NVMe status: LBA Out of Range: The command references an LBA that exceeds the size of the namespace(0x4080)
> nvme (nvme-6.5) # dmesg -c
> [73684.208241] nvme nvme0: using deprecated NVME_IOCTL_IO_CMD ioctl on the char device!
> [73684.208419] nvme0n1: Write Zeroes(0x8), LBA Out of Range (sct 0x0 / sc 0x80) DNR cdw10=0x200000 cdw11=0x0 cdw12=0xa cdw13=0x0 cdw14=0x0 cdw15=0x0
> nvme (nvme-6.5) #
>
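
The cdw values in the log lines above can be read back by hand. The sketch
below is only an illustration: the helper names are hypothetical, and the
field layouts are assumptions based on the NVMe base specification (Get Log
Page cdw10: LID in bits 7:0, LSP in bits 14:8, RAE in bit 15, NUMDL in bits
31:16; Write Zeroes: SLBA in cdw10/cdw11, zero-based NLB in cdw12 bits 15:0).

```c
#include <stdint.h>

/* Hypothetical container for the Get Log Page cdw10 fields. */
struct get_log_cdw10 {
	uint8_t lid;    /* log page identifier */
	uint8_t lsp;    /* log specific field */
	uint8_t rae;    /* retain asynchronous event */
	uint16_t numdl; /* number of dwords, lower 16 bits, zero-based */
};

static inline struct get_log_cdw10 decode_get_log_cdw10(uint32_t cdw10)
{
	struct get_log_cdw10 f;

	f.lid = cdw10 & 0xff;
	f.lsp = (cdw10 >> 8) & 0x7f;
	f.rae = (cdw10 >> 15) & 0x1;
	f.numdl = (cdw10 >> 16) & 0xffff;
	return f;
}

/* Starting LBA of a Write Zeroes command: cdw11 holds the upper 32 bits. */
static inline uint64_t wz_slba(uint32_t cdw10, uint32_t cdw11)
{
	return ((uint64_t)cdw11 << 32) | cdw10;
}

/* Number of logical blocks; the on-wire NLB field is zero-based. */
static inline uint32_t wz_nr_blocks(uint32_t cdw12)
{
	return (cdw12 & 0xffff) + 1;
}
```

Under these assumptions, cdw10=0x7f0107 from the admin log decodes to LID 0x07,
LSP 0x1, RAE 0 and NUMDL 0x7f, while the I/O log's cdw10=0x200000 and
cdw12=0xa describe 11 blocks starting at LBA 0x200000, which matches the
"-s 0x200000 -c 10" arguments passed to nvme write-zeroes.
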


