[PATCH v2 12/20] block,nvme,scsi,dm: Add blk_status to pr_ops callouts.

Tue Aug 9 12:33:00 PDT 2022

On 8/9/22 11:08, Mike Christie wrote:
> On 8/9/22 2:21 AM, Christoph Hellwig wrote:
>> On Mon, Aug 08, 2022 at 07:04:11PM -0500, Mike Christie wrote:
>>> To handle both cases, this patch adds a blk_status_t arg to the pr_ops
>>> callouts. The lower levels will convert their device specific error to
>>> the blk_status_t then the upper levels can easily check that code
>>> without knowing the device type. It also allows us to keep userspace
>>> compat where it expects a negative -Exyz error code if the command fails
>>> before it's sent to the device or a device/tranport specific value if the
>>> error is > 0.
>>
>> Why do we need two return values here?
> 
> I know the 2 return values are gross :) I can do it in one, but I wasn't sure
> what's worse. See below for the other possible solutions. I think they are all
> bad.
> 
> 
> 0. Convert device specific conflict error to -EBADE then back:
> 
> sd_pr_command()
> 
> .....
> 
> /* would add similar check for NVME_SC_RESERVATION_CONFLICT in nvme */
> if (result == SAM_STAT_CHECK_CONDITION)
> 	return -EBADE;
> else
> 	return result;
> 
> 
> LIO then just checks for -EBADE but when going to userspace we have to
> convert:
> 
> 
> blkdev_pr_register()
> 
> ...
> 	result = ops->pr_register()
> 	if (result < 0) {
> 		/* For compat we must convert back to the nvme/scsi code */
> 		if (result == -EBADE) {
> 			/* need some helper for this that calls down the stack */
> 			if (bdev == SCSI)
> 				return SAM_STAT_RESERVATION_CONFLICT
> 			else
> 				return NVME_SC_RESERVATION_CONFLICT
> 		} else
> 			return blk_status_to_str(result)
> 	} else
> 		return result;
> 
> 
> The conversion is kind of gross and I was thinking in the future it's going
> to get worse. I'm going to want to have more advanced error handling in LIO
> and dm-multipath. Like dm-multipath wants to know if an pr_op failed because
> of a path failure, so it can retry another one, or a hard device/target error.
> It would be nice for LIO if an PGR had bad/illegal values and the device
> returned an error than I could detect that.
> 
> 
> 1. Drop the -Exyz error type and use blk_status_t in the kernel:
> 
> sd_pr_command()
> 
> .....
> if (result < 0)
> 	return -errno_to_blk_status(result);
> else if (result == SAM_STAT_CHECK_CONDITION)
> 	return -BLK_STS_NEXUS;
> else
> 	return result;
> 
> blkdev_pr_register()
> 
> ...
> 	result = ops->pr_register()
> 	if (result < 0) {
> 		/* For compat we must convert back to the nvme/scsi code */
> 		if (result == -BLK_STS_NEXUS) {
> 			/* need some helper for this that calls down the stack */
> 			if (bdev == SCSI)
> 				return SAM_STAT_RESERVATION_CONFLICT
> 			else
> 				return NVME_SC_RESERVATION_CONFLICT
> 		} else
> 			return blk_status_to_str(result)
> 	} else
> 		return result;
> 
> This has similar issues as #0 where we have to convert before returning to
> userspace.
> 
> 
> Note: In this case, if the block layer uses an -Exyz error code there's not
> BLK_STS for then we would return -EIO to userspace now. I was thinking
> that might not be ok but I could also just add a BLK_STS error code
> for errors like EINVAL, EWOULDBLOCK, ENOMEM, etc so that doesn't happen.
> 
> 
> 2. We could do something like below where the low levels are not changed but the
> caller converts:
> 
> sd_pr_command()
> 	/* no changes */
> 
> lio()
> 	result = ops->pr_register()
> 	if (result > 0) {
> 		/* add some stacked helper again that goes through dm and
> 		 * to the low level device
> 		 */
> 		if (bdev == SCSI) {
> 			result = scsi_result_to_blk_status(result)
> 		else
> 			result = nvme_error_status(result)
> 
> 
> This looks simple, but it felt wrong having upper layers having to
> know the device type and calling conversion functions.

Has it been considered to introduce a new enumeration type instead of 
choosing (0), (1) or (2)?

Thanks,

Bart.