[PATCHv3] nvme: generate uevent once a multipath namespace is operational again

Sagi Grimberg sagi at grimberg.me
Mon May 17 10:49:19 PDT 2021


> When fast_io_fail_tmo is set I/O will be aborted while recovery is
> still ongoing. This causes MD to set the namespace to failed, and
> no further I/O will be submitted to that namespace.
> 
> However, once the recovery succeeds and the namespace becomes
> operational again the NVMe subsystem doesn't send a notification,
> so MD cannot automatically reinstate operation and requires
> manual interaction.
> 
> This patch will send a KOBJ_CHANGE uevent per multipathed namespace
> once the underlying controller transitions to LIVE, allowing an automatic
> MD reassembly with these udev rules:
> 
> /etc/udev/rules.d/65-md-auto-re-add.rules:
> SUBSYSTEM!="block", GOTO="md_end"
> 
> ACTION!="change", GOTO="md_end"
> ENV{ID_FS_TYPE}!="linux_raid_member", GOTO="md_end"
> PROGRAM="/sbin/md_raid_auto_readd.sh $devnode"
> LABEL="md_end"
> 
> /sbin/md_raid_auto_readd.sh:
> 
> #!/bin/bash
> MDADM=/sbin/mdadm
> DEVNAME=$1
> 
> export $(${MDADM} --examine --export ${DEVNAME})
> 
> if [ -z "${MD_UUID}" ]; then
>      exit 1
> fi
> 
> UUID_LINK=$(readlink /dev/disk/by-id/md-uuid-${MD_UUID})
> MD_DEVNAME=${UUID_LINK##*/}
> export $(${MDADM} --detail --export /dev/${MD_DEVNAME})
> if [ -z "${MD_METADATA}" ] ; then
>      exit 1
> fi
> if [ $(cat /sys/block/${MD_DEVNAME}/md/degraded) != 1 ]; then
>      echo "${MD_DEVNAME}: array not degraded, nothing to do"
>      exit 0
> fi
> MD_STATE=$(cat /sys/block/${MD_DEVNAME}/md/array_state)
> if [ ${MD_STATE} != "clean" ] ; then
>      echo "${MD_DEVNAME}: array state ${MD_STATE}, cannot re-add"
>      exit 1
> fi
> MD_VARNAME="MD_DEVICE_dev_${DEVNAME##*/}_ROLE"
> if [ ${!MD_VARNAME} = "spare" ] ; then
>      ${MDADM} --manage /dev/${MD_DEVNAME} --re-add ${DEVNAME}
> fi

Is this auto-readd stuff going to util-linux?
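
For anyone following the script, the last check builds a variable name from
the device node and reads it with bash indirect expansion, so it needs bash
rather than plain sh. A minimal sketch of just that lookup is below. The
helper name member_role and the exported sample values are made up for
illustration; they are not real mdadm output.

```bash
#!/bin/bash
# Sketch only: member_role and the sample values are hypothetical.

# Print the role recorded for a member device, given its device node path.
# Expects MD_DEVICE_dev_<name>_ROLE variables in the environment, which the
# quoted script obtains via `export $(mdadm --detail --export ...)`.
member_role() {
    local devname=$1
    # Strip the directory part: /dev/nvme0n1 -> nvme0n1
    local varname="MD_DEVICE_dev_${devname##*/}_ROLE"
    # ${!varname} expands the variable whose name is stored in varname.
    # The trailing '-' yields an empty string if that variable is unset.
    printf '%s\n' "${!varname-}"
}

# Made-up sample values standing in for the exported mdadm variables.
export MD_DEVICE_dev_nvme0n1_ROLE=spare
export MD_DEVICE_dev_nvme1n1_ROLE=0

member_role /dev/nvme0n1   # prints "spare"
member_role /dev/nvme1n1   # prints "0"
member_role /dev/nvme2n1   # prints an empty line (variable not set)
```

Quoting the expansion and defaulting it to empty matters here: with an unset
variable, an unquoted `[ ${!MD_VARNAME} = "spare" ]` collapses to
`[ = "spare" ]` and the test fails with a syntax error instead of simply
evaluating to false.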

> 
> Changes to v2:
> - Add udev rules example to description
> Changes to v1:
> - use disk_uevent() as suggested by hch

This changelog belongs after the '---' separator.

> 
> Signed-off-by: Hannes Reinecke <hare at suse.de>
> ---
>   drivers/nvme/host/multipath.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 0551796517e6..ecc99bd5f8ad 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -100,8 +100,11 @@ void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
>   
>   	down_read(&ctrl->namespaces_rwsem);
>   	list_for_each_entry(ns, &ctrl->namespaces, list) {
> -		if (ns->head->disk)
> -			kblockd_schedule_work(&ns->head->requeue_work);
> +		if (!ns->head->disk)
> +			continue;
> +		kblockd_schedule_work(&ns->head->requeue_work);
> +		if (ctrl->state == NVME_CTRL_LIVE)
> +			disk_uevent(ns->head->disk, KOBJ_CHANGE);
>   	}

I asked this on v1: is this only needed for mpath devices?


