Re: [PATCH v2 12/20] block,nvme,scsi,dm: Add blk_status to pr_ops callouts.

Mike Christie <michael.christie@xxxxxxxxxx> · Tue, 9 Aug 2022 13:08:08 -0500

On 8/9/22 2:21 AM, Christoph Hellwig wrote:
> On Mon, Aug 08, 2022 at 07:04:11PM -0500, Mike Christie wrote:
>> To handle both cases, this patch adds a blk_status_t arg to the pr_ops
>> callouts. The lower levels will convert their device specific error to
>> the blk_status_t then the upper levels can easily check that code
>> without knowing the device type. It also allows us to keep userspace
>> compat where it expects a negative -Exyz error code if the command fails
>> before it's sent to the device or a device/tranport specific value if the
>> error is > 0.
> 
> Why do we need two return values here?

I know the 2 return values are gross :) I can do it in one, but I wasn't sure
what's worse. See below for the other possible solutions. I think they are all
bad.

0. Convert device specific conflict error to -EBADE then back:

sd_pr_command()

.....

/* would add similar check for NVME_SC_RESERVATION_CONFLICT in nvme */
if (result == SAM_STAT_CHECK_CONDITION)
	return -EBADE;
else
	return result;

LIO then just checks for -EBADE but when going to userspace we have to
convert:

blkdev_pr_register()

...
	result = ops->pr_register()
	if (result < 0) {
		/* For compat we must convert back to the nvme/scsi code */
		if (result == -EBADE) {
			/* need some helper for this that calls down the stack */
			if (bdev == SCSI)
				return SAM_STAT_RESERVATION_CONFLICT
			else
				return NVME_SC_RESERVATION_CONFLICT
		} else
			return blk_status_to_str(result)
	} else
		return result;

The conversion is kind of gross and I was thinking in the future it's going
to get worse. I'm going to want to have more advanced error handling in LIO
and dm-multipath. Like dm-multipath wants to know if an pr_op failed because
of a path failure, so it can retry another one, or a hard device/target error.
It would be nice for LIO if an PGR had bad/illegal values and the device
returned an error than I could detect that.

1. Drop the -Exyz error type and use blk_status_t in the kernel:

sd_pr_command()

.....
if (result < 0)
	return -errno_to_blk_status(result);
else if (result == SAM_STAT_CHECK_CONDITION)
	return -BLK_STS_NEXUS;
else
	return result;

blkdev_pr_register()

...
	result = ops->pr_register()
	if (result < 0) {
		/* For compat we must convert back to the nvme/scsi code */
		if (result == -BLK_STS_NEXUS) {
			/* need some helper for this that calls down the stack */
			if (bdev == SCSI)
				return SAM_STAT_RESERVATION_CONFLICT
			else
				return NVME_SC_RESERVATION_CONFLICT
		} else
			return blk_status_to_str(result)
	} else
		return result;

This has similar issues as #0 where we have to convert before returning to
userspace.

Note: In this case, if the block layer uses an -Exyz error code there's not
BLK_STS for then we would return -EIO to userspace now. I was thinking
that might not be ok but I could also just add a BLK_STS error code
for errors like EINVAL, EWOULDBLOCK, ENOMEM, etc so that doesn't happen.

2. We could do something like below where the low levels are not changed but the
caller converts:

sd_pr_command()
	/* no changes */

lio()
	result = ops->pr_register()
	if (result > 0) { 
		/* add some stacked helper again that goes through dm and
		 * to the low level device
		 */
		if (bdev == SCSI) {
			result = scsi_result_to_blk_status(result)
		else
			result = nvme_error_status(result)

This looks simple, but it felt wrong having upper layers having to
know the device type and calling conversion functions.