On 2/5/24 11:19, Ming Lei wrote: >>>> +static inline void blk_zone_wplug_add_bio(struct blk_zone_wplug *zwplug, >>>> + struct bio *bio, unsigned int nr_segs) >>>> +{ >>>> + /* >>>> + * Keep a reference on the BIO request queue usage. This reference will >>>> + * be dropped either if the BIO is failed or after it is issued and >>>> + * completes. >>>> + */ >>>> + percpu_ref_get(&bio->bi_bdev->bd_disk->queue->q_usage_counter); >>> >>> It is fragile to get nested usage_counter, and same with grabbing/releasing it >>> from different contexts or even functions, and it could be much better to just >>> let block layer maintain it. >>> >>> From patch 23's change: >>> >>> + * Zoned block device information. Reads of this information must be >>> + * protected with blk_queue_enter() / blk_queue_exit(). Modifying this >>> >>> Anytime if there is in-flight bio, the block device is opened, so both gendisk and >>> request_queue are live, so not sure if this .q_usage_counter protection >>> is needed. >> >> Hannes also commented about this. Let me revisit this. > > I think only queue re-configuration(blk_revalidate_zone) requires the > queue usage counter. Otherwise, bdev open()/close() should work just > fine. I want to check FS case though. No clear if mounting FS that supports zone (btrfs) also uses bdev open ? >>>> + /* >>>> + * blk-mq devices will reuse the reference on the request queue usage >>>> + * we took when the BIO was plugged, but the submission path for >>>> + * BIO-based devices will not do that. So drop this reference here. >>>> + */ >>>> + if (bio->bi_bdev->bd_has_submit_bio) >>>> + blk_queue_exit(bio->bi_bdev->bd_disk->queue); >>> >>> But I don't see where this reference is reused for blk-mq in this patch, >>> care to point it out? >> >> This patch modifies blk_mq_submit_bio() to add a "goto new_request" at the top >> for any BIO flagged with BIO_FLAG_ZONE_WRITE_PLUGGING. So when a plugged BIO is >> unplugged and submitted again, the reference that was taken in >> blk_zone_wplug_add_bio() is reused for the new request for that BIO. > > OK, this reference reuse may be worse, because queue freeze can't prevent new > write zoned bio from being submitted any more given only percpu_ref_get() is > called for all write zoned bios. New BIOs (BIOS that have never been plugged yet) will go through the normal blk_queue_enter() in blk_mq_submit_bio(), so they will be stopped there if another context asked for a queue freeze. I do not think there is any issue with how things are currently done (we tested that *a lot* with many different drives and drive configs with DM etc). Reference counting as it is is OK, even though it most likely be simplified. I am looking at that now. -- Damien Le Moal Western Digital Research