Re: [PATCH 06/26] block: Introduce zone write plugging

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Feb 05, 2024 at 11:41:04AM +0900, Damien Le Moal wrote:
> On 2/5/24 11:19, Ming Lei wrote:
> >>>> +static inline void blk_zone_wplug_add_bio(struct blk_zone_wplug *zwplug,
> >>>> +					  struct bio *bio, unsigned int nr_segs)
> >>>> +{
> >>>> +	/*
> >>>> +	 * Keep a reference on the BIO request queue usage. This reference will
> >>>> +	 * be dropped either if the BIO is failed or after it is issued and
> >>>> +	 * completes.
> >>>> +	 */
> >>>> +	percpu_ref_get(&bio->bi_bdev->bd_disk->queue->q_usage_counter);
> >>>
> >>> It is fragile to get nested usage_counter, and same with grabbing/releasing it
> >>> from different contexts or even functions, and it could be much better to just
> >>> let block layer maintain it.
> >>>
> >>> From patch 23's change:
> >>>
> >>> +	 * Zoned block device information. Reads of this information must be
> >>> +	 * protected with blk_queue_enter() / blk_queue_exit(). Modifying this
> >>>
> >>> Anytime if there is in-flight bio, the block device is opened, so both gendisk and
> >>> request_queue are live, so not sure if this .q_usage_counter protection
> >>> is needed.
> >>
> >> Hannes also commented about this. Let me revisit this.
> > 
> > I think only queue re-configuration(blk_revalidate_zone) requires the
> > queue usage counter. Otherwise, bdev open()/close() should work just
> > fine.
> 
> I want to check FS case though. No clear if mounting FS that supports zone
> (btrfs) also uses bdev open ?

I feel the following delta change might be cleaner and easily documented:

- one IO takes single reference for both bio based and blk-mq,
- no drop & re-grab
- only grab extra reference for bio based
- two code paths share same pattern

diff --git a/block/blk-core.c b/block/blk-core.c
index 9520ccab3050..118dd789beb5 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -597,6 +597,10 @@ static void __submit_bio(struct bio *bio)
 
 	if (!bio->bi_bdev->bd_has_submit_bio) {
 		blk_mq_submit_bio(bio);
+	} else if (bio_zone_write_plugging(bio)) {
+		struct gendisk *disk = bio->bi_bdev->bd_disk;
+
+		disk->fops->submit_bio(bio);
 	} else if (likely(bio_queue_enter(bio) == 0)) {
 		struct gendisk *disk = bio->bi_bdev->bd_disk;
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f0fc61a3ec81..fc6d792747dc 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3006,8 +3006,12 @@ void blk_mq_submit_bio(struct bio *bio)
 	if (blk_mq_attempt_bio_merge(q, bio, nr_segs))
 		goto queue_exit;
 
+	/*
+	 * Grab one reference for plugged zoned write and it will be reused in
+	 * next real submission
+	 */
 	if (blk_queue_is_zoned(q) && blk_zone_write_plug_bio(bio, nr_segs))
-		goto queue_exit;
+		return;
 
 	if (!rq) {
 new_request:
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index f6d4f511b664..87abb3f7ef30 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -514,7 +514,8 @@ static inline void blk_zone_wplug_add_bio(struct blk_zone_wplug *zwplug,
 	 * be dropped either if the BIO is failed or after it is issued and
 	 * completes.
 	 */
-	percpu_ref_get(&bio->bi_bdev->bd_disk->queue->q_usage_counter);
+	if (bio->bi_bdev->bd_has_submit_bio)
+		percpu_ref_get(&bio->bi_bdev->bd_disk->queue->q_usage_counter);
 
 	/*
 	 * The BIO is being plugged and thus will have to wait for the on-going
@@ -760,15 +761,10 @@ static void blk_zone_wplug_bio_work(struct work_struct *work)
 
 	blk_zone_wplug_unlock(zwplug, flags);
 
-	/*
-	 * blk-mq devices will reuse the reference on the request queue usage
-	 * we took when the BIO was plugged, but the submission path for
-	 * BIO-based devices will not do that. So drop this reference here.
-	 */
+	submit_bio_noacct_nocheck(bio);
+
 	if (bio->bi_bdev->bd_has_submit_bio)
 		blk_queue_exit(bio->bi_bdev->bd_disk->queue);
-
-	submit_bio_noacct_nocheck(bio);
 }
 
 static struct blk_zone_wplug *blk_zone_alloc_write_plugs(unsigned int nr_zones)

Thanks,
Ming





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux