Re: [PATCH v9 08/11] scsi: sd_zbc: emulate ZONE_APPEND commands

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020-04-28 8:09 a.m., Johannes Thumshirn wrote:
On 28/04/2020 13:42, Hannes Reinecke wrote:
[...]
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 50fff0bf8c8e..6009311105ef 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -79,6 +79,12 @@ struct scsi_disk {
    	u32		zones_optimal_open;
    	u32		zones_optimal_nonseq;
    	u32		zones_max_open;
+	u32		*zones_wp_ofst;
+	spinlock_t	zones_wp_ofst_lock;
+	u32		*rev_wp_ofst;
+	struct mutex	rev_mutex;
+	struct work_struct zone_wp_ofst_work;
+	char		*zone_wp_update_buf;
    #endif
    	atomic_t	openers;
    	sector_t	capacity;	/* size in logical blocks */

'zones_wp_ofst' ?

Please replace the cryptic 'ofst' with 'offset'; those three additional
characters don't really make a difference ...

'zones_wp_ofst' was good to maintain the 80 chars limit and not end up
with broken up lines, because we crossed the limit. I'll have a look if
we can make it 'zones_wp_offset' without uglifying the code.

When stuck like that, I have started using "_o" as in "lba_o" for
LBA's octet offset :-)

Good rule of thumb: if you can't write "a = b + c" (a couple of tab
indentations in) then your variable names are too damn long.

Of course the Linux kernel dictating indent_to_8 and restricting lines
to punched card width doesn't help.

Doug Gilbert

[...]

@@ -396,11 +633,67 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf,
    	return 0;
    }
+static void sd_zbc_revalidate_zones_cb(struct gendisk *disk)
+{
+	struct scsi_disk *sdkp = scsi_disk(disk);
+
+	swap(sdkp->zones_wp_ofst, sdkp->rev_wp_ofst);
+}
+
+static int sd_zbc_revalidate_zones(struct scsi_disk *sdkp,
+				   u32 zone_blocks,
+				   unsigned int nr_zones)
+{
+	struct gendisk *disk = sdkp->disk;
+	int ret = 0;
+
+	/*
+	 * Make sure revalidate zones are serialized to ensure exclusive
+	 * updates of the scsi disk data.
+	 */
+	mutex_lock(&sdkp->rev_mutex);
+
+	/*
+	 * Revalidate the disk zones to update the device request queue zone
+	 * bitmaps and the zone write pointer offset array. Do this only once
+	 * the device capacity is set on the second revalidate execution for
+	 * disk scan or if something changed when executing a normal revalidate.
+	 */
+	if (sdkp->first_scan) {
+		sdkp->zone_blocks = zone_blocks;
+		sdkp->nr_zones = nr_zones;
+		goto unlock;
+	}
+
+	if (sdkp->zone_blocks == zone_blocks &&
+	    sdkp->nr_zones == nr_zones &&
+	    disk->queue->nr_zones == nr_zones)
+		goto unlock;
+
+	sdkp->rev_wp_ofst = kvcalloc(nr_zones, sizeof(u32), GFP_NOIO);
+	if (!sdkp->rev_wp_ofst) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
+	ret = blk_revalidate_disk_zones(disk, sd_zbc_revalidate_zones_cb);
+
+	kvfree(sdkp->rev_wp_ofst);
+	sdkp->rev_wp_ofst = NULL;
+
+unlock:
+	mutex_unlock(&sdkp->rev_mutex);

I don't really understand this.
Passing a callback is fine if things happen asynchronously, and you
wouldn't know from the calling context when that happened. Ok.
But the above code definitely assumes that blk_revalidate_disk_zones()
will be completed upon return, otherwise we'll get a nice crash in the
callback function as the 'rev' pointer is invalid.
But _if_ blk_revalidata_disk_zones() has completed upon return we might
as well kill the callback, have the ->rev_wp_ofst a local variable ans
simply the whole thing.

Sorry but I don't understand your comment. If in
blk_revalidate_disk_zones() returns an error, all that happens is that
we free the rev_wp_ofst pointer and return the error to the caller.

And looking at blk_revalidate_disk_zones() itself, I can't see a code
path that calls the callback if something went wrong:

noio_flag = memalloc_noio_save();
ret = disk->fops->report_zones(disk, 0, UINT_MAX,
                                 blk_revalidate_zone_cb, &args);
memalloc_noio_restore(noio_flag);
/*
   * Install the new bitmaps and update nr_zones only once the queue is
   * stopped and all I/Os are completed (i.e. a scheduler is not
   * referencing the bitmaps).
   */
blk_mq_freeze_queue(q);
if (ret >= 0) {
          blk_queue_chunk_sectors(q, args.zone_sectors);
          q->nr_zones = args.nr_zones;
          swap(q->seq_zones_wlock, args.seq_zones_wlock);
          swap(q->conv_zones_bitmap, args.conv_zones_bitmap);
          if (update_driver_data)
                  update_driver_data(disk);
          ret = 0;
} else {
          pr_warn("%s: failed to revalidate zones\n", disk->disk_name);
          blk_queue_free_zone_bitmaps(q);
}
blk_mq_unfreeze_queue(q);

And even *iff* the callback would be executed, we would have:
static void sd_zbc_revalidate_zones_cb(struct gendisk *disk)
{
          struct scsi_disk *sdkp = scsi_disk(disk);

          swap(sdkp->zones_wp_ofst, sdkp->rev_wp_ofst);
}

I.e. we exchange some pointers. I can't see a possible crash here, we're
not accessing anything.

Byte,
	Johannes





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux