Re: [PATCH 6/6] xfs: online scrub needn't bother zeroing its temporary buffer

On Fri, Jul 05, 2019 at 01:26:39PM -0400, Brian Foster wrote:
> On Fri, Jul 05, 2019 at 09:35:04AM -0700, Darrick J. Wong wrote:
> > On Fri, Jul 05, 2019 at 10:52:46AM -0400, Brian Foster wrote:
> > > On Wed, Jun 26, 2019 at 01:47:10PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > 
> > > > The xattr scrubber functions use the temporary memory buffer either for
> > > > storing bitmaps or for testing if attribute value extraction works.  The
> > > > bitmap code always zeroes what it needs and the value extraction merely
> > > > sets the buffer contents (we never read the contents, we just look for
> > > > return codes), so it's not necessary to waste CPU time zeroing on
> > > > allocation.
> > > > 
> > > 
> > > If we don't need to zero the buffer because we never look at the result,
> > > that suggests we don't need to populate it in the first place right?
> > 
> > We still need to read the attr value into the buffer (at least for
> > remote attr values) because scrub doesn't otherwise check the remote
> > attribute block header.
> > 
> > We never read the contents (because the contents are just arbitrary
> > bytes) but we do need to be able to catch an EFSCORRUPTED if, say, the
> > attribute dabtree points at a corrupt block.
> > 
> 
> Ok.. what I'm getting at here is basically wondering if since the buffer
> zeroing was noticeable in performance traces, whether the xattr value
> memory copy might be similarly noticeable for certain datasets (many
> large xattrs?). I suppose that may be less prominent if the buffer
> alloc/zero was unconditional as opposed to tied to the existence of an
> actual xattr, but that doesn't necessarily mean the performance impact
> is zero.
> 
> If non-zero, it might be interesting to explore whether some sort of
> lookup interface makes sense for xattrs that essentially do everything
> we currently do via xfs_attr_get() except read the attr. Presumably we
> could avoid the memory copy along with the buffer allocation in that
> case. But that's just a random thought for future consideration,
> certainly not low hanging fruit as is this patch. If you have a good
> scrub performance test, an easy experiment might be to run it with a
> hack to skip the buffer allocation, pass a NULL buffer and
> conditionalize the ->value accesses/copies in the xattr code to avoid
> explosions and see whether there's any benefit.

Ahhh, yes.  Currently for flame graph analysis I just use perf record +
Brendan Gregg's flamegraph tools to spit out an svg and then go digging
into any call stack that is wide and not especially conical.  I hadn't
really noticed the actual attr value copyout but that's only because it
tends to get lost in the noise of parsing through attr leaves and
whatnot.

However, it does sound like a nice shortcut to be able to set
xfs_da_args.value = NULL and have the attr value code go through the
motions of extracting the value but skip the memcpy part.
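
A rough userspace sketch of that idea, not actual XFS code: the
header check always runs so corruption is still reported, and the copy
only happens when the caller supplies a buffer.  Every name here
(demo_block, demo_args, demo_attr_get, DEMO_MAGIC, DEMO_EFSCORRUPTED) is
hypothetical and only stands in for the real xfs_da_args / xfs_attr_get()
machinery.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC		0x5841524dU	/* hypothetical block magic */
#define DEMO_EFSCORRUPTED	117		/* hypothetical stand-in error code */

/* Hypothetical stand-in for a remote attr block: header plus payload. */
struct demo_block {
	uint32_t	magic;
	uint32_t	len;
	unsigned char	data[64];
};

/* Hypothetical stand-in for the value fields of xfs_da_args. */
struct demo_args {
	void		*value;		/* NULL: verify only, skip the copy */
	size_t		valuelen;	/* in: buffer size; out: value size */
};

static int
demo_attr_get(const struct demo_block *blk, struct demo_args *args)
{
	/* The header check always runs, so a bad block is still caught. */
	if (blk->magic != DEMO_MAGIC || blk->len > sizeof(blk->data))
		return -DEMO_EFSCORRUPTED;

	/* Verify-only caller: report the length, skip the memcpy. */
	if (!args->value) {
		args->valuelen = blk->len;
		return 0;
	}

	/* Normal caller: make sure the buffer is big enough, then copy. */
	if (args->valuelen < blk->len) {
		args->valuelen = blk->len;
		return -ERANGE;
	}
	memcpy(args->value, blk->data, blk->len);
	args->valuelen = blk->len;
	return 0;
}
```

In this sketch a scrub-style caller would pass value = NULL and look
only at the return code, which is all the scrubber cares about anyway.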

Will put this on my list of things to study for 5.4. :)

--D

> > > > A flame graph analysis showed that we were spending 7% of a xfs_scrub
> > > > run (the whole program, not just the attr scrubber itself) allocating
> > > > and zeroing 64k segments needlessly.
> > > > 
> > > 
> > > How much does this patch help?
> > 
> > About 1-2% I think.  FWIW the "7%" figure represents the smallest
> > improvement I saw in runtimes, where allocation ate 1-2% of the runtime
> > and zeroing accounts for the rest (~5-6%).
> > 
> > Practically speaking, when I retested with NVME flash instead of
> > spinning rust then the improvement jumped to 15-20% overall.
> > 
> 
> Nice!
> 
> Brian
> 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > ---
> > > >  fs/xfs/scrub/attr.c |    7 ++++++-
> > > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > > 
> > > > 
> > > > diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c
> > > > index 09081d8ab34b..d3a6f3dacf0d 100644
> > > > --- a/fs/xfs/scrub/attr.c
> > > > +++ b/fs/xfs/scrub/attr.c
> > > > @@ -64,7 +64,12 @@ xchk_setup_xattr_buf(
> > > >  		sc->buf = NULL;
> > > >  	}
> > > >  
> > > > -	ab = kmem_zalloc_large(sizeof(*ab) + sz, flags);
> > > > +	/*
> > > > +	 * Allocate the big buffer.  We skip zeroing it because that added 7%
> > > > +	 * to the scrub runtime and all the users were careful never to read
> > > > +	 * uninitialized contents.
> > > > +	 */
> > > 
> > > Ok, that suggests the 7% hit was due to zeroing (where the commit log
> > > says "allocating and zeroing"). Either way, we probably don't need such
> > > details in the code. Can we tweak the comment to something like:
> > > 
> > > /*
> > >  * Don't zero the buffer on allocation to avoid runtime overhead. All
> > >  * users must be careful never to read uninitialized contents.
> > >  */ 
> > 
> > Ok, I'll do that.
> > 
> > Thanks for all the review! :)
> > 
> > --D
> > 
> > > 
> > > With that:
> > > 
> > > Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>
> > > 
> > > > +	ab = kmem_alloc_large(sizeof(*ab) + sz, flags);
> > > >  	if (!ab)
> > > >  		return -ENOMEM;
> > > >  
> > > > 


