Re: [PATCH v11 04/16] scsi: core: Introduce a mechanism for reordering requests in the error handler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/22/23 23:26, Hannes Reinecke wrote:
On 8/22/23 21:16, Bart Van Assche wrote:
+/*
+ * Comparison function that allows to sort SCSI commands by ULD driver.
+ */
+static int scsi_cmp_uld(void *priv, const struct list_head *_a,
+            const struct list_head *_b)
+{
+    struct scsi_cmnd *a = list_entry(_a, typeof(*a), eh_entry);
+    struct scsi_cmnd *b = list_entry(_b, typeof(*b), eh_entry);
+
+    /* See also the comment above the list_sort() definition. */
+    return scsi_cmd_to_driver(a) > scsi_cmd_to_driver(b);

I have to agree with Christoph here.
Comparing LBA numbers at the SCSI level is really the wrong place.
SCSI commands might be anything, and quite some of these commands don't
even have LBA numbers. So trying to order them will be pointless.

The reordering mechanism really has to go into the block layer, with
the driver failing the request and the block layer resubmitting in-order.

Hi Hannes,

Please take another look at patches 04/16 and 05/16. As one can see no
LBA numbers are being compared in the SCSI core - comparing LBA numbers
happens in the sd (SCSI disk) driver. The code that you replied to
compares ULD pointers, a well-defined concept in the SCSI core.

Your request to move the functionality from patches 04/16 and 05/16 into
the block layer would involve the following:
* Report the unaligned write errors (because a write did not happen at the
  write pointer) to the block layer (BLK_STS_WP_MISMATCH?).
* Introduce a mechanism in the block layer for postponing error handling
  until all outstanding commands have failed. The approach from the SCSI
  core (tracking the number of failed and the number of busy commands
  and only waking up the error handler after these counters are equal)
  would be unacceptable because of the runtime overhead this mechanism
  would introduce in the block layer hot path. Additionally, I strongly
  doubt that it is possible to introduce any mechanism for postponing
  error handling in the block layer without introducing additional
  overhead in the hot path.
* Christoph's opinion is that NVMe software should use zone append
  (REQ_OP_ZONE_APPEND) instead of regular writes (REQ_OP_WRITE) when
  writing to a zoned namespace. So the SCSI subsystem would be the only
  user of the new mechanism introduced in the block layer. The reason we
  chose REQ_OP_WRITE for zoned UFS devices is because the SCSI standard
  does not support a zone append command and introducing a zone append
  command in the SCSI standards is not something that can be realized in
  time for the first generation of zoned UFS devices.

Because I assume that both Jens and Christoph disagree strongly with your
request: I have no plans to move the code for sorting zoned writes into
the block layer core.

Jens and Christoph, please correct me if I misunderstood something.

Thanks,

Bart.



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux