Re: [PATCH 01/20] pack-revindex: introduce a new API

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 08, 2021 at 01:16:43PM -0500, Taylor Blau wrote:

> In the next several patches, we will prepare for loading a reverse index
> either in memory, or from a yet-to-be-introduced on-disk format. To do
> that, we'll introduce an API that avoids the caller explicitly indexing
> the revindex pointer in the packed_git structure.

This API looks good to me. Here are a few extra thoughts:

> There are four ways to interact with the reverse index. Accordingly,
> four functions will be exported from 'pack-revindex.h' by the time that
> the existing API is removed. A caller may:

This tells us what the new API functions do. That's useful, but should
it be in the header file itself, documenting each function?

Likewise, I think we'd want to define the concepts in that
documentation. Something like:


  /*
   * A revindex allows converting efficiently between three properties
   * of an object within a pack:
   *
   *  - index position: the numeric position within the list of
   *    sorted object ids found in the .idx file
   *
   *  - pack position: the numeric position within the list of objects
   *    in their order within the actual .pack file (i.e., 0 is the
   *    first object in the .pack, 1 is the second, and so on)
   *
   *  - offset: the byte offset within the .pack file at which the
   *    object contents can be found
   */

And then above each function we can just say that it converts X to Y
(like you have in the commit message). It may also be worth indicating
the run-time of each (some of them are constant-time once you have a
revindex, and some are log(n)).

> +int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos)

The types here make sense. off_t is clearly needed for a pack offset,
and uint32_t is correct for the position fields, because packs have a
4-byte object count.

Separating the error return from the out-parameter makes the interface
slightly more awkward, but is needed to use the properly-sized types.
Makes sense.

> +int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos)
> +{
> +	int ret;
> +
> +	if (load_pack_revindex(p) < 0)
> +		return -1;

This one lazy-loads the revindex for us, which seems handy...

> +uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos)
> +{
> +	if (!p->revindex)
> +		BUG("pack_pos_to_index: reverse index not yet loaded");
> +	if (pos >= p->num_objects)
> +		BUG("pack_pos_to_index: out-of-bounds object at %"PRIu32, pos);
> +	return p->revindex[pos].nr;
> +}

But these ones don't. I'm glad we at least catch it with a BUG(), but it
makes the API a little funny. Returning an error here would require a
similarly awkward out-parameter, I guess.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux