[PATCH v2 4/6] nvme-rdma: avoid IO error and repeated request completion

Sagi Grimberg sagi at grimberg.me
Wed Jan 20 16:35:47 EST 2021


is not something we should be handling in nvme. block drivers
>>>> should be able to fail queue_rq, and this all should live in the
>>>> block layer.
>>> Of course, it is also an idea to repair the block drivers directly.
>>> However, block layer is unaware of nvme native multipathing,
>>
>> Nor should it be
>>
>>> will cause the request return error which should be avoided.
>>
>> Not sure I understand..
>> requests should failover for path related errors,
>> what queue_rq errors are expected to be failed over from your
>> perspective?
> Although failing over only for path-related errors is the best choice, it's
> almost impossible to achieve.
> The probability of non-path-related errors is very low. Although these
> errors do not require a failover retry, the cost of retrying is only that
> the request completes with an error a bit later (after several retries).
> It's not the best choice, but I think it's acceptable, because the
> HBA driver does not have path-related error codes but only general error
> codes. It is difficult to identify whether the general error codes are
> path-related.

If we have a SW bug or breakage that can happen occasionally, this can
result in a constant failover rather than a simple failure. This is just
not a good approach IMO.

>>> The scenario: use two HBAs for nvme native multipath, and then one HBA
>>> fault,
>>
>> What is the specific error the driver sees?
> The path related error code is closely related to HBA driver
> implementation. In general it is EIO. I don't think it's a good idea to
> assume what general error code the driver returns in the event of a path
> error.

But is assuming every error is a path error a good idea?

>>> the blk_status_t of queue_rq is BLK_STS_IOERR, blk-mq will call
>>> blk_mq_end_request to complete the request, which bypasses nvme native
>>> multipath. We expect the request to fail over to the normal HBA, but the
>>> request is directly completed with BLK_STS_IOERR.
>>> The two scenarios can be fixed by directly completing the request in
>>> queue_rq.
>> Well, certainly this one-shot approach of always returning 0 and completing
>> the command with a HOST_PATH error is not a good approach IMO
> So what's the better option? Just complete the request with host path
> error for non-ENOMEM and EAGAIN returned by the HBA driver?

Well, the correct thing to do here would be to clone the bio and
failover if the end_io error status is BLK_STS_IOERR. That sucks
because it adds overhead, but this proposal doesn't sit well. It
looks wrong to me.
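
To make the "fail over only on BLK_STS_IOERR" idea concrete, here is a rough
userspace model, not kernel code. Every name in it (model_status, model_path,
model_io, model_submit_with_failover) is a hypothetical stand-in; it does not
use struct bio, blk_status_t, or any real block-layer call, and it leaves out
the actual bio cloning entirely. It only shows the decision: retry on the next
path when the completion status is the generic I/O error, and pass any other
status straight back.

```c
#include <stddef.h>

/* Hypothetical stand-ins for completion statuses; NOT the kernel's blk_status_t. */
enum model_status {
	MODEL_STS_OK = 0,
	MODEL_STS_IOERR,     /* generic I/O error, treated as "maybe a path error" */
	MODEL_STS_RESOURCE   /* any other error, never failed over in this model */
};

/* One path (e.g. one HBA). 'result' is what a submission on it completes with. */
struct model_path {
	enum model_status result;
};

/* Bookkeeping for one I/O that may be resubmitted on another path. */
struct model_io {
	int path_idx;   /* index of the last path tried, -1 if none */
	int attempts;   /* number of submissions so far */
};

/* Models submission plus completion on a single path. */
static enum model_status model_submit(const struct model_path *p)
{
	return p->result;
}

/*
 * Try each path in order. Move on to the next path only when the completion
 * status is MODEL_STS_IOERR; success or any other error ends the loop.
 * Returns the status of the last submission (MODEL_STS_IOERR if no paths).
 */
static enum model_status
model_submit_with_failover(struct model_io *io,
			   const struct model_path *paths, size_t npaths)
{
	enum model_status st = MODEL_STS_IOERR;

	for (size_t i = 0; i < npaths; i++) {
		io->path_idx = (int)i;
		io->attempts++;
		st = model_submit(&paths[i]);
		if (st != MODEL_STS_IOERR)
			break;
	}
	return st;
}
```

With a faulted first path and a healthy second one, the model ends up on path 1
after two submissions. The overhead mentioned above is hidden here: in a real
implementation each resubmission would need its own clone of the original bio.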

Alternatively, a more creative idea would be to encode the error
status somehow in the cookie returned from submit_bio, but that
also feels like a small(er) hack..
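
For illustration only, packing a status into a 32-bit cookie could look
roughly like the sketch below. The layout (top 4 bits for status) and all the
names are made up for this example; the real cookie returned from submit_bio
has its own encoding, which this sketch ignores.

```c
#include <stdint.h>

/* Hypothetical layout: top 4 bits carry a status, low 28 bits carry the cookie. */
#define MODEL_COOKIE_STATUS_SHIFT 28u
#define MODEL_COOKIE_STATUS_MASK  (0xFu << MODEL_COOKIE_STATUS_SHIFT)

/* Store 'status' (0..15) in the top bits, keeping the low 28 bits of 'cookie'. */
static uint32_t model_cookie_encode(uint32_t cookie, uint32_t status)
{
	return (cookie & ~MODEL_COOKIE_STATUS_MASK) |
	       ((status & 0xFu) << MODEL_COOKIE_STATUS_SHIFT);
}

/* Extract the status stored by model_cookie_encode(). */
static uint32_t model_cookie_status(uint32_t encoded)
{
	return (encoded & MODEL_COOKIE_STATUS_MASK) >> MODEL_COOKIE_STATUS_SHIFT;
}

/* Extract the original low 28 bits of the cookie. */
static uint32_t model_cookie_value(uint32_t encoded)
{
	return encoded & ~MODEL_COOKIE_STATUS_MASK;
}
```

The submitter would then check model_cookie_status() on the returned value and
decide whether to fail over, without a separate completion path.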


