[RFC PATCH 1/2] nvme-multipath: introduce delayed removal of the multipath head node

Nilay Shroff nilay at linux.ibm.com
Tue Apr 8 07:07:48 PDT 2025



On 4/7/25 8:14 PM, Christoph Hellwig wrote:
>> @@ -3690,6 +3690,10 @@ static struct nvme_ns_head *nvme_alloc_ns_head(struct nvme_ctrl *ctrl,
>>  	ratelimit_state_init(&head->rs_nuse, 5 * HZ, 1);
>>  	ratelimit_set_flags(&head->rs_nuse, RATELIMIT_MSG_ON_RELEASE);
>>  	kref_init(&head->ref);
>> +#ifdef CONFIG_NVME_MULTIPATH
>> +	if (ctrl->ops->flags & NVME_F_FABRICS)
>> +		set_bit(NVME_NSHEAD_FABRICS, &head->flags);
>> +#endif
> 
> We might want to make the flags unconditional or move this into a helper
> to avoid the ifdef'ery if we keep the flag (see below).
Yes okay, this could be moved into a helper so that we can avoid ifdef'ery.

> 
>> -	if (last_path)
>> -		nvme_mpath_shutdown_disk(ns->head);
>> +	nvme_mpath_shutdown_disk(ns->head);
> 
> I guess this function is where the shutdown naming came from, and it
> probably was a bad idea even back then..
> 
> Maybe throw in an extra patch to rename it as well.
> 
Sure will do.

>> +	/*
>> +	 * For non-fabric controllers we support delayed removal of head disk
>> +	 * node. If we reached up to here then it means that head disk is still
>> +	 * alive and so we assume here that even if there's no path available
>> +	 * maybe due to the transient link failure, we could queue up the IO
>> +	 * and later when path becomes ready we re-submit queued IO.
>> +	 */
>> +	if (!(test_bit(NVME_NSHEAD_FABRICS, &head->flags)))
>> +		return true;
> 
> Why is this conditional on fabrics or not?  The same rationale should
> apply as much if not more for fabrics controllers.
> 
For fabrics we already have options like "reconnect_delay" and 
"max_reconnects". So in case of fabric link failures, we delay 
the removal of the head disk node based on those options.

> Also no need for the set of braces around the test_bit() call.
> 
Yeah agreed!
>>  }
>>  
>> +static void nvme_remove_head(struct nvme_ns_head *head)
>> +{
>> +	if (test_and_clear_bit(NVME_NSHEAD_DISK_LIVE, &head->flags)) {
>> +		/*
>> +		 * requeue I/O after NVME_NSHEAD_DISK_LIVE has been cleared
>> +		 * to allow multipath to fail all I/O.
>> +		 */
> 
> Full sentence are supposed to start with a capitalized word.
> 
> (yes, I saw this just move, but it's a good chance to fix it)
> 
Yes, I will fix it in the next patch.

Thanks,
--Nilay





More information about the Linux-nvme mailing list