Re: [PATCH v2 7/7] reftable/block: avoid decoding keys when searching restart points

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 02, 2024 at 11:47:16AM -0500, Justin Tobler wrote:
> On 24/03/25 11:11AM, Patrick Steinhardt wrote:
> > When searching over restart points in a block we decode the key of each
> > of the records, which results in a memory allocation. This is quite
> > pointless though given that records it restart points will never use
> > prefix compression and thus store their keys verbatim in the block.
> > 
> > Refactor the code so that we can avoid decoding the keys, which saves us
> > some allocations.
> 
> Out of curiousity, do you have any benchmarks around this change and
> would that be something we would want to add to the commit message?

I don't have a benchmark. The problem is that the difference isn't
really measureable when doing a single seek, only, because seeks are
simply too fast. The only usecase where I know that there are a ton of
of record seeks are writes, but here the performance improvement is
getting drowned out by everything else.

You can try to measure allocations and indeed see a difference. But
again, this is getting drowned out by the noise for writes. With my
block reader refactorings (ps/reftable-block-iteration-optim) you can
see the difference when iterating through refs. Before:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 314 allocs, 189 frees, 106,035 bytes allocated

After:

  HEAP SUMMARY:
      in use at exit: 13,603 bytes in 125 blocks
    total heap usage: 303 allocs, 178 frees, 105,763 bytes allocated

But yeah, it's nothing that'd make you go "Oh, wow!". As said, it will
add up when doing many seeks, but I didn't manage to find a proper
benchamrk yet that would be worthy to make it into the commit message.

Patrick

> -Justin
> 
> > 
> > Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
> > ---
> >  reftable/block.c | 29 +++++++++++++++++++----------
> >  1 file changed, 19 insertions(+), 10 deletions(-)
> > 
> > diff --git a/reftable/block.c b/reftable/block.c
> > index ca80a05e21..8bb4e43cec 100644
> > --- a/reftable/block.c
> > +++ b/reftable/block.c
> > @@ -287,23 +287,32 @@ static int restart_needle_less(size_t idx, void *_args)
> >  		.buf = args->reader->block.data + off,
> >  		.len = args->reader->block_len - off,
> >  	};
> > -	struct strbuf kth_restart_key = STRBUF_INIT;
> > -	uint8_t unused_extra;
> > -	int result, n;
> > +	uint64_t prefix_len, suffix_len;
> > +	uint8_t extra;
> > +	int n;
> >  
> >  	/*
> > -	 * TODO: The restart key is verbatim in the block, so we can in theory
> > -	 * avoid decoding the key and thus save some allocations.
> > +	 * Records at restart points are stored without prefix compression, so
> > +	 * there is no need to fully decode the record key here. This removes
> > +	 * the need for allocating memory.
> >  	 */
> > -	n = reftable_decode_key(&kth_restart_key, &unused_extra, in);
> > -	if (n < 0) {
> > +	n = reftable_decode_keylen(in, &prefix_len, &suffix_len, &extra);
> > +	if (n < 0 || prefix_len) {
> >  		args->error = 1;
> >  		return -1;
> >  	}
> >  
> > -	result = strbuf_cmp(&args->needle, &kth_restart_key);
> > -	strbuf_release(&kth_restart_key);
> > -	return result < 0;
> > +	string_view_consume(&in, n);
> > +	if (suffix_len > in.len) {
> > +		args->error = 1;
> > +		return -1;
> > +	}
> > +
> > +	n = memcmp(args->needle.buf, in.buf,
> > +		   args->needle.len < suffix_len ? args->needle.len : suffix_len);
> > +	if (n)
> > +		return n < 0;
> > +	return args->needle.len < suffix_len;
> >  }
> >  
> >  void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)
> > -- 
> > 2.44.GIT
> > 
> 
> 

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux