[PATCH for-next 4/4] nvme-multipath: add multipathing for uring-passthrough commands
Sagi Grimberg
sagi at grimberg.me
Wed Jul 13 05:43:39 PDT 2022
On 7/13/22 14:49, Hannes Reinecke wrote:
> On 7/13/22 13:00, Sagi Grimberg wrote:
>>
>>>> Maybe the solution is to just not expose a /dev/ng for the mpath device
>>>> node, but only for bottom namespaces. Then it would be completely
>>>> equivalent to scsi-generic devices.
>>>>
>>>> It just creates an unexpected mix of semantics of best-effort
>>>> multipathing with just path selection, but no requeue/failover...
>>>
>>> Which is exactly the same semantics as SG_IO on the dm-mpath nodes.
>>
>> I view uring passthru somewhat as a different thing than sending SG_IO
>> ioctls to dm-mpath. But it can be argued otherwise.
>>
>> BTW, the only consumer of it that I'm aware of commented, back when
>> dm-mpath retry for SG_IO submission was attempted, that he expects
>> dm-mpath to retry SG_IO
>> (https://www.spinics.net/lists/dm-devel/msg46924.html).
>>
>> From Paolo:
>> "The problem is that userspace does not have a way to direct the
>> command to a different path in the resubmission. It may not even have
>> permission to issue DM_TABLE_STATUS, or to access the /dev nodes for
>> the underlying paths, so without Martin's patches SG_IO on dm-mpath is
>> basically unreliable by design."
>>
>> I didn't manage to track down any followup after that email though...
>>
> I did; 'twas me who was involved in the initial customer issue leading
> up to that.
>
> Amongst all the other issues we've found, the prime problem with SG_IO is
> that it needs to be directed to the 'active' path.
> For that the device-mapper has a distinct callout (dm_prepare_ioctl), which
> essentially returns the current active path device. And then the
> device-mapper core issues the command on that active path.
>
> All nice and good, _unless_ that command triggers an error.
> Normally it'd be intercepted by the dm-multipath end_io handler, and
> would set the path to offline.
> But as ioctls do not use the normal I/O path the end_io handler is never
> called, and further SG_IO calls are happily routed down the failed path.
>
> And the customer had to use SG_IO (or, in qemu-speak, LUN passthrough)
> as his application/filesystem makes heavy use of persistent reservations.
How did this conclude, Hannes?
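
To make the failure mode described above concrete, here is a toy model of
it. This is only a sketch: Path, ToyMultipath, submit_io and submit_ioctl
are hypothetical names invented for illustration, not kernel interfaces,
and active_path() merely stands in for the role that dm_prepare_ioctl is
described as playing. It assumes that normal I/O marks a path offline on
error and retries elsewhere, while the ioctl route returns the error to the
caller and leaves the path state alone.

```python
# Toy model only: none of these names are kernel APIs.

class Path:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy   # actual state of the link
        self.offline = False     # what the multipath layer believes


class ToyMultipath:
    def __init__(self, paths):
        self.paths = paths

    def active_path(self):
        # Stand-in for the "return the current active path" callout:
        # pick the first path that is not marked offline.
        for p in self.paths:
            if not p.offline:
                return p
        raise RuntimeError("no usable path")

    def submit_io(self):
        # Normal I/O: the completion handler sees the error, marks the
        # path offline, and the request is retried on the next path.
        while True:
            p = self.active_path()
            if p.healthy:
                return p.name
            p.offline = True

    def submit_ioctl(self):
        # Passthrough: the error goes straight back to the caller and
        # the path state is never updated.
        p = self.active_path()
        if not p.healthy:
            raise OSError(f"command failed on {p.name}")
        return p.name
```

With a broken first path, repeated submit_ioctl() calls keep failing on
that same path, whereas a single submit_io() marks it offline and moves on,
after which submit_ioctl() also lands on the working path.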