[PATCH 09/24] scsi-multipath: failover handling

John Garry john.g.garry at oracle.com
Mon Mar 2 04:20:01 PST 2026


On 02/03/2026 03:57, Benjamin Marzinski wrote:
> n Wed, Feb 25, 2026 at 03:36:12PM +0000, John Garry wrote:
>> For a scmd which suffers failover, requeue the master bio of each bio
>> attached to its request.
>>
>> A handler is added in the scsi_driver structure to lookup a
>> mpath_disk from a request. This is needed because the scsi_disk structure
>> will manage the mpath_disk, and the code core has no method to look this
>> up from the scsi_scmnd.
>>
>> Failover occurs when the scsi_cmnd has failed and it is discovered that the
>> original scsi_device has transport down.
>>
>> Signed-off-by: John Garry<john.g.garry at oracle.com>
>> ---
>>   drivers/scsi/scsi_error.c     | 12 ++++++
>>   drivers/scsi/scsi_lib.c       |  9 +++-
>>   drivers/scsi/scsi_multipath.c | 80 +++++++++++++++++++++++++++++++++++
>>   include/scsi/scsi.h           |  1 +
>>   include/scsi/scsi_driver.h    |  3 ++
>>   include/scsi/scsi_multipath.h | 14 ++++++
>>   6 files changed, 118 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>> index f869108fd9693..0fd1b46764c3f 100644
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -40,6 +40,7 @@
>>   #include <scsi/scsi_ioctl.h>
>>   #include <scsi/scsi_dh.h>
>>   #include <scsi/scsi_devinfo.h>
>> +#include <scsi/scsi_multipath.h>
>>   #include <scsi/sg.h>
>>   
>>   #include "scsi_priv.h"
>> @@ -1901,12 +1902,16 @@ bool scsi_noretry_cmd(struct scsi_cmnd *scmd)
>>   enum scsi_disposition scsi_decide_disposition(struct scsi_cmnd *scmd)
>>   {
> I'm no scsi expert, but the checks in scsi_decide_disposition() don't
> look quite right to me. You only failover in cases were it the code
> might want to retry (and when the device is offline), but there are
> other cases, when you don't want to retry the same path, where it would
> make sense to try another path, DID_TRANSPORT_FAILFAST for example.

Yeah, I need to check handling of FAILFAST-types more closely.

> 
> As a more general issue, since you are already cloning the bios,
> couldn't you just trigger the failovers in scsi_mpath_clone_end_io().
> blk_path_error() can tell you which bios should be retried. That seems
> like a much simpler approach than trying to intercept the request before
> it completes the clones, and I don't see what you're losing by letting
> the clones complete, and dealing requeueing there (again, I'm no scsi
> expert).

Sure, I'll consider it further. But I am thinking that it's nicer to 
keep the error handling centralized and not have FAILOVER specifics in 
the cloning code.

> 
>>   	enum scsi_disposition rtn;
>> +	struct request *req = scsi_cmd_to_rq(scmd);
>>   
>>   	/*
>>   	 * if the device is offline, then we clearly just pass the result back
>>   	 * up to the top level.
>>   	 */
>>   	if (!scsi_device_online(scmd->device)) {
>> +		if (scsi_is_mpath_request(req))
>> +			return scsi_mpath_failover_disposition(scmd);
>> +
>>   		SCSI_LOG_ERROR_RECOVERY(5, scmd_printk(KERN_INFO, scmd,
>>   			"%s: device offline - report as SUCCESS\n", __func__));
>>   		return SUCCESS;
>> diff --git a/drivers/scsi/scsi_multipath.c b/drivers/scsi/scsi_multipath.c
>> index c3e0f792e921f..16b1f84fc552c 100644
>> --- a/drivers/scsi/scsi_multipath.c
>> +++ b/drivers/scsi/scsi_multipath.c
>> @@ -518,6 +518,86 @@ void scsi_mpath_put_head(struct scsi_mpath_head *scsi_mpath_head)
>>   }
>>   EXPORT_SYMBOL_GPL(scsi_mpath_put_head);
>>   
>> +bool scsi_is_mpath_request(struct request *req)
>> +{
>> +	return is_mpath_request(req);
>> +}
>> +EXPORT_SYMBOL_GPL(scsi_is_mpath_request);
>> +
>> +static inline void bio_list_add_clone_master(struct bio_list *bl,
>> +				struct bio *clone)
>> +{
>> +	struct scsi_mpath_clone_bio *scsi_mpath_clone_bio;
>> +	struct bio *master_bio;
>> +
>> +	if (clone->bi_next)
>> +		bio_list_add_clone_master(bl, clone->bi_next);
> This is a pretty minimal function, but a request could have a lot of
> merged bios, so you probably want to hande them iteratively, instead of
> recursing into the function, and eating up stack space unnecessarily.
> Also, this reverses the bio's order when adding them to the requeue
> list.

Yeah, I think that this function can be improved. Self-calling functions 
are generally nasty.

Thanks!



More information about the Linux-nvme mailing list