Re: block: remove submit_bio_noacct

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 03, 2023 at 11:39:36AM -0500, Mike Snitzer wrote:
> On Fri, Feb 03 2023 at 10:00P -0500,
> Christoph Hellwig <hch@xxxxxx> wrote:
> 
> > On Thu, Feb 02, 2023 at 10:16:53PM -0500, Mike Snitzer wrote:
> > > > > The current usage of submit_bio vs submit_bio_noacct which skips the
> > > > > VM events and task account is a bit unclear.  It seems to be mostly
> > > > > intended for sending on bios received by stacking drivers, but also
> > > > > seems to be used for stacking drivers newly generated metadata
> > > > > sometimes.
> > > > 
> > > > Your lack of confidence conveyed in the above shook me a little bit
> > > > considering so much of this code is attributed to you -- I mostly got
> > > > past that, but I am a bit concerned about one aspect of the
> > > > submit_bio() change (2nd to last comment inlined below).
> > 
> > The confidence is about how it is used.  And that's up to the driver
> > authors, not helped by them not having any guidelines.  And while
> > I've touched this code a lot, the split between the two levels of API
> > long predates me.
> > 
> > > > > Remove the separate API and just skip the accounting if submit_bio
> > > > > is called recursively.  This gets us an accounting behavior that
> > > > > is very similar (but not quite identical) to the current one, while
> > > > > simplifying the API and code base.
> > > > 
> > > > Can you elaborate on the "but not quite identical"? This patch is
> > > > pretty mechanical, just folding code and renaming.. but you obviously
> > > > saw subtle differences.  Likely worth callign those out precisely.
> > 
> > The explanation was supposed to be in the Lines above.  Now accounting
> > is skipped if in a ->submit_bio recursion.  Before that it dependent
> > on drivers calling either submit_bio or submit_bio_noacct, for which
> > there was no clear guideline and drivers have been a bit sloppy about.
> 
> OK, but afaict the code is identical after your refactoring.
> Side-effect is drivers that were double accounting will now be fixed.

Doesn't this open dm up to double accounting? Take dm-delay for
instance. If the delay is 0, then submit_bio() for the underlying device
will be called recursively and will skip the accounting.  If the delay
isn't 0, then submit_bio() for the underlying device will be called
later from a workqueue and the accounting will happen again, even though
the same amount of IO happens in either case. Or am I missing something
here?

-Ben 
  
> > > > 
> > > > How have you tested this patch?  Seems like I should throw all the lvm
> > > > and DM tests at it.
> > 
> > blktests and the mdadm tests (at least as far as they got upstream, md
> > or it's tests always seem somewhat broken).  dmtests is something
> > I've never managed to get to actually run due it's insistence on
> > using not packaged ruby stuff.
> 
> Yeah, device-mapper-test-suite (dmts) is a PITA due to ruby dep. And
> the tests have gotten a bit stale relative to recent kernels. I'm
> aware of this and also about how others would like to see more DM
> coverage in blktests. We'll be looking to improve DM testing but it
> does always tend to get put on back-burner (but I'll be getting some
> help from Ben Marzinski to assess what tests we do have, see where we
> have gaps and also put effort to making DM testing part of blktests).
> 
> I'm actually now pretty interested to see which (if any) DM tests
> would have caught the missing bio checks issue in your initial patch.
>  
> > > > In practice this will manifest as delaying the negative checks, until
> > > > returning from active submit_bio, but they will still happen.
> > > 
> > > Actually, I don't think those checks are done at all now.
> > 
> > Yes, the branch needs to be later as in this version below.
> 
> Thanks.
> 
> > > Unless I'm missing something, this seems like it needs proper
> > > justification and a lot of review and testing.
> > > 
> > > So why do this change?
> > 
> > Because I once again got a question from an auther of a driver that
> > is planned to be upstreamed on which one to use.  And the answer
> > was it's complicated, and you really should not have to think about
> > it, let me dig out my old patch so that driver authors don't have
> > to care.
> 
> That's fair, and as I tried to say in my first reply: I agree it does
> clear up that confusion nicely.
> 
> Please fold this incremental patch in, with that you can add:
> 
> Reviewed-by: Mike Snitzer <snitzer@xxxxxxxxxx>
> Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>
> 
> (not sure if my Signed-off-by needed but there you have it if so).
> 
> diff --git a/block/bio.c b/block/bio.c
> index ea143fd825d7..aa0586012b0d 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -475,7 +475,7 @@ static struct bio *bio_alloc_percpu_cache(struct block_device *bdev,
>   *
>   * Note that when running under submit_bio() (i.e. any block driver),
>   * bios are not submitted until after you return - see the code in
> - * submit_bio() that converts recursion into iteration, to prevent
> + * submit_bio_nocheck() that converts recursion into iteration, to prevent
>   * stack overflows.
>   *
>   * This would normally mean allocating multiple bios under submit_bio()
> diff --git a/block/blk-core.c b/block/blk-core.c
> index f755ac1a2931..c41f086cb183 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -687,7 +687,7 @@ void submit_bio_nocheck(struct bio *bio)
>  	/*
>  	 * We only want one ->submit_bio to be active at a time, else stack
>  	 * usage with stacked devices could be a problem.  Use current->bio_list
> -	 * to collect a list of requests submited by a ->submit_bio method while
> +	 * to collect a list of requests submitted by a ->submit_bio method while
>  	 * it is active, and then process them after it returned.
>  	 */
>  	if (current->bio_list)
> @@ -720,13 +720,13 @@ void submit_bio(struct bio *bio)
>  
>  	might_sleep();
>  
> -	if (blkcg_punt_bio_submit(bio))
> -		return;
> -
>  	/*
>  	 * Do not double account bios that are remapped and resubmitted.
>  	 */
>  	if (!current->bio_list) {
> +		if (blkcg_punt_bio_submit(bio))
> +			return;
> +
>  		if (bio_op(bio) == REQ_OP_READ) {
>  			task_io_account_read(bio->bi_iter.bi_size);
>  			count_vm_events(PGPGIN, bio_sectors(bio));




[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux