Re: [PATCH v5 02/17] pack-mtimes: support reading .mtimes files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 25, 2022 at 12:48:54AM -0700, Jonathan Nieder wrote:
> >> What does mtimes_map contain?  A comment would help.
> >
> > It contains a pointer at the beginning of the mmapped region of the
> > .mtimes file, similar to revindex_map above it.
>
> To be clear, in cases like this by "comment" I mean "in-code comment".
> I.e., my interest is not that _I_ find out the answer but that the
> code becomes more maintainable via the answer becoming easier to find.

OK. I'll add a comment in the fixup! patch which I'm about to send.

> [...]
> >> This seems simple enough that it's not obvious we need more code
> >> sharing.  Do you agree?  If so, I'd suggest just removing the
> >> NEEDSWORK comment.
> >
> > Yeah, it is conceptually simple, though it feels like the sort of thing
> > that could benefit from not having to be written once for each
> > extension (hence the comment).
>
> The reason I asked is that the NEEDSWORK here actually got in the way
> of comprehension for me --- it made me wonder "is there some
> complexity here I'm missing?"
>
> That's why I'd suggest one of
> - removing the NEEDSWORK comment
> - going ahead and implementing the code sharing you mean, or
> - fleshing out the NEEDSWORK comment so the reader can wonder less

I am a little sad to remove it, since I thought it was useful as-is. But
I can just as easily remember to come back to this myself in the future,
so if it is distracting to you in the meantime, then I don't mind
holding onto it in my own head.

> >>> +
> >>> +#define MTIMES_HEADER_SIZE (12)
> >>> +#define MTIMES_MIN_SIZE (MTIMES_HEADER_SIZE + (2 * the_hash_algo->rawsz))
> >>
> >> Hm, the all-caps name makes this feel like a compile-time constant but
> >> it contains a reference to the_hash_algo.  Could it be an inline
> >> function instead?
> >
> > Yes, it could be an inline function, but I don't think there is
> > necessarily anything wrong with it being a #define'd macro. There are
> > some other examples, e.g., RIDX_MIN_SIZE, MIDX_MIN_SIZE,
> > GRAPH_DATA_WIDTH, and PACK_SIZE_THRESHOLD (to name a few) which also use
> > the_hash_algo on the right-hand side of a `#define`.
>
> Those are due to an incomplete migration from use of the true constant
> GIT_SHA1_RAWSZ to use of the dynamic value the_hash_algo->rawsz, no?
> In other words, "other examples do it wrong" doesn't feel like a great
> justification for making it worse in new code.

Fair point. I can imagine reasons for the existing pattern, but updating
it to handle the variable rawsz is easy to do (and it probably should
have been that way since the beginning).

> [...]
> >>> +static int load_pack_mtimes_file(char *mtimes_file,
> >>> +				 uint32_t num_objects,
> >>> +				 const uint32_t **data_p, size_t *len_p)
> >>
> >> What does this function do?  A comment would help.
> >
> > I know that I'm biased as the author of this code, but I think the
> > signature is clear here. At least, I'm not sure what information a
> > comment would add that the function name and its arguments don't already
> > convey.
>
> Ah, thanks for this point of clarification.  What isn't clear from the
> signature is
> - when should I call this function?
> - what does its return value represent?
> - how does it handle errors?
>
> I agree that the parameters are self-explanatory.

I'm hesitant to over-document a static function with a single caller,
but when looking at this, I think there is an opportunity to document
_its_ caller (`load_pack_mtimes()`) which isn't static, but was also
missing documentation.

> >>> +cleanup:
> >>> +	if (ret) {
> >>> +		if (data)
> >>> +			munmap(data, mtimes_size);
> >>> +	} else {
> >>> +		*len_p = mtimes_size;
> >>> +		*data_p = (const uint32_t *)data;
> >>
> >> Do we know that 'data' is uint32_t aligned?  Casting earlier in the
> >> function could make that more obvious.
> >
> > `data` is definitely uint32_t aligned, but this is a tradeoff, since if
> > we wrote:
> >
> >     uint32_t *data = xmmap(...);
> >
> > then I think we would have to change the case where ret is non-zero to be:
> >
> >     if (data)
> >         munmap((void*)data, ...);
> >
> > and likewise, data_p is const.
>
> Doing it that way sounds great to me.  That way, the type contains the
> information we need up-front and the safety of the cast is obvious in
> the place where the cast is needed.
>
> (Although my understanding is also that in C it's fine to pass a
> uint32_t* to a function expecting a void*, so the second cast would
> also not be needed.)
>
> [...]

Done, thanks for the suggestion.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux