On 28/04/2020 13:42, Hannes Reinecke wrote: [...] >> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h >> index 50fff0bf8c8e..6009311105ef 100644 >> --- a/drivers/scsi/sd.h >> +++ b/drivers/scsi/sd.h >> @@ -79,6 +79,12 @@ struct scsi_disk { >> u32 zones_optimal_open; >> u32 zones_optimal_nonseq; >> u32 zones_max_open; >> + u32 *zones_wp_ofst; >> + spinlock_t zones_wp_ofst_lock; >> + u32 *rev_wp_ofst; >> + struct mutex rev_mutex; >> + struct work_struct zone_wp_ofst_work; >> + char *zone_wp_update_buf; >> #endif >> atomic_t openers; >> sector_t capacity; /* size in logical blocks */ > > 'zones_wp_ofst' ? > > Please replace the cryptic 'ofst' with 'offset'; those three additional > characters don't really make a difference ... 'zones_wp_ofst' was good to maintain the 80 chars limit and not end up with broken up lines, because we crossed the limit. I'll have a look if we can make it 'zones_wp_offset' without uglifying the code. [...] >> @@ -396,11 +633,67 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf, >> return 0; >> } >> >> +static void sd_zbc_revalidate_zones_cb(struct gendisk *disk) >> +{ >> + struct scsi_disk *sdkp = scsi_disk(disk); >> + >> + swap(sdkp->zones_wp_ofst, sdkp->rev_wp_ofst); >> +} >> + >> +static int sd_zbc_revalidate_zones(struct scsi_disk *sdkp, >> + u32 zone_blocks, >> + unsigned int nr_zones) >> +{ >> + struct gendisk *disk = sdkp->disk; >> + int ret = 0; >> + >> + /* >> + * Make sure revalidate zones are serialized to ensure exclusive >> + * updates of the scsi disk data. >> + */ >> + mutex_lock(&sdkp->rev_mutex); >> + >> + /* >> + * Revalidate the disk zones to update the device request queue zone >> + * bitmaps and the zone write pointer offset array. Do this only once >> + * the device capacity is set on the second revalidate execution for >> + * disk scan or if something changed when executing a normal revalidate. >> + */ >> + if (sdkp->first_scan) { >> + sdkp->zone_blocks = zone_blocks; >> + sdkp->nr_zones = nr_zones; >> + goto unlock; >> + } >> + >> + if (sdkp->zone_blocks == zone_blocks && >> + sdkp->nr_zones == nr_zones && >> + disk->queue->nr_zones == nr_zones) >> + goto unlock; >> + >> + sdkp->rev_wp_ofst = kvcalloc(nr_zones, sizeof(u32), GFP_NOIO); >> + if (!sdkp->rev_wp_ofst) { >> + ret = -ENOMEM; >> + goto unlock; >> + } >> + >> + ret = blk_revalidate_disk_zones(disk, sd_zbc_revalidate_zones_cb); >> + >> + kvfree(sdkp->rev_wp_ofst); >> + sdkp->rev_wp_ofst = NULL; >> + >> +unlock: >> + mutex_unlock(&sdkp->rev_mutex); > > I don't really understand this. > Passing a callback is fine if things happen asynchronously, and you > wouldn't know from the calling context when that happened. Ok. > But the above code definitely assumes that blk_revalidate_disk_zones() > will be completed upon return, otherwise we'll get a nice crash in the > callback function as the 'rev' pointer is invalid. > But _if_ blk_revalidata_disk_zones() has completed upon return we might > as well kill the callback, have the ->rev_wp_ofst a local variable ans > simply the whole thing. Sorry but I don't understand your comment. If in blk_revalidate_disk_zones() returns an error, all that happens is that we free the rev_wp_ofst pointer and return the error to the caller. And looking at blk_revalidate_disk_zones() itself, I can't see a code path that calls the callback if something went wrong: noio_flag = memalloc_noio_save(); ret = disk->fops->report_zones(disk, 0, UINT_MAX, blk_revalidate_zone_cb, &args); memalloc_noio_restore(noio_flag); /* * Install the new bitmaps and update nr_zones only once the queue is * stopped and all I/Os are completed (i.e. a scheduler is not * referencing the bitmaps). */ blk_mq_freeze_queue(q); if (ret >= 0) { blk_queue_chunk_sectors(q, args.zone_sectors); q->nr_zones = args.nr_zones; swap(q->seq_zones_wlock, args.seq_zones_wlock); swap(q->conv_zones_bitmap, args.conv_zones_bitmap); if (update_driver_data) update_driver_data(disk); ret = 0; } else { pr_warn("%s: failed to revalidate zones\n", disk->disk_name); blk_queue_free_zone_bitmaps(q); } blk_mq_unfreeze_queue(q); And even *iff* the callback would be executed, we would have: static void sd_zbc_revalidate_zones_cb(struct gendisk *disk) { struct scsi_disk *sdkp = scsi_disk(disk); swap(sdkp->zones_wp_ofst, sdkp->rev_wp_ofst); } I.e. we exchange some pointers. I can't see a possible crash here, we're not accessing anything. Byte, Johannes