RE: [PATCH v5 02/17] pack-mtimes: support reading .mtimes files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On May 25, 2022 5:37 PM, Taylor Blau wrote:
>On Wed, May 25, 2022 at 12:48:54AM -0700, Jonathan Nieder wrote:
>> >> What does mtimes_map contain?  A comment would help.
>> >
>> > It contains a pointer at the beginning of the mmapped region of the
>> > .mtimes file, similar to revindex_map above it.
>>
>> To be clear, in cases like this by "comment" I mean "in-code comment".
>> I.e., my interest is not that _I_ find out the answer but that the
>> code becomes more maintainable via the answer becoming easier to find.
>
>OK. I'll add a comment in the fixup! patch which I'm about to send.
>
>> [...]
>> >> This seems simple enough that it's not obvious we need more code
>> >> sharing.  Do you agree?  If so, I'd suggest just removing the
>> >> NEEDSWORK comment.
>> >
>> > Yeah, it is conceptually simple, though it feels like the sort of
>> > thing that could benefit from not having to be written once for each
>> > extension (hence the comment).
>>
>> The reason I asked is that the NEEDSWORK here actually got in the way
>> of comprehension for me --- it made me wonder "is there some
>> complexity here I'm missing?"
>>
>> That's why I'd suggest one of
>> - removing the NEEDSWORK comment
>> - going ahead and implementing the code sharing you mean, or
>> - fleshing out the NEEDSWORK comment so the reader can wonder less
>
>I am a little sad to remove it, since I thought it was useful as-is. But I can just as
>easily remember to come back to this myself in the future, so if it is distracting to
>you in the meantime, then I don't mind holding onto it in my own head.
>
>> >>> +
>> >>> +#define MTIMES_HEADER_SIZE (12)
>> >>> +#define MTIMES_MIN_SIZE (MTIMES_HEADER_SIZE + (2 *
>> >>> +the_hash_algo->rawsz))
>> >>
>> >> Hm, the all-caps name makes this feel like a compile-time constant
>> >> but it contains a reference to the_hash_algo.  Could it be an
>> >> inline function instead?
>> >
>> > Yes, it could be an inline function, but I don't think there is
>> > necessarily anything wrong with it being a #define'd macro. There
>> > are some other examples, e.g., RIDX_MIN_SIZE, MIDX_MIN_SIZE,
>> > GRAPH_DATA_WIDTH, and PACK_SIZE_THRESHOLD (to name a few) which
>also
>> > use the_hash_algo on the right-hand side of a `#define`.
>>
>> Those are due to an incomplete migration from use of the true constant
>> GIT_SHA1_RAWSZ to use of the dynamic value the_hash_algo->rawsz, no?
>> In other words, "other examples do it wrong" doesn't feel like a great
>> justification for making it worse in new code.
>
>Fair point. I can imagine reasons for the existing pattern, but updating it to handle
>the variable rawsz is easy to do (and it probably should have been that way since
>the beginning).
>
>> [...]
>> >>> +static int load_pack_mtimes_file(char *mtimes_file,
>> >>> +				 uint32_t num_objects,
>> >>> +				 const uint32_t **data_p, size_t *len_p)
>> >>
>> >> What does this function do?  A comment would help.
>> >
>> > I know that I'm biased as the author of this code, but I think the
>> > signature is clear here. At least, I'm not sure what information a
>> > comment would add that the function name and its arguments don't
>> > already convey.
>>
>> Ah, thanks for this point of clarification.  What isn't clear from the
>> signature is
>> - when should I call this function?
>> - what does its return value represent?
>> - how does it handle errors?
>>
>> I agree that the parameters are self-explanatory.
>
>I'm hesitant to over-document a static function with a single caller, but when
>looking at this, I think there is an opportunity to document _its_ caller
>(`load_pack_mtimes()`) which isn't static, but was also missing documentation.
>
>> >>> +cleanup:
>> >>> +	if (ret) {
>> >>> +		if (data)
>> >>> +			munmap(data, mtimes_size);
>> >>> +	} else {
>> >>> +		*len_p = mtimes_size;
>> >>> +		*data_p = (const uint32_t *)data;
>> >>
>> >> Do we know that 'data' is uint32_t aligned?  Casting earlier in the
>> >> function could make that more obvious.
>> >
>> > `data` is definitely uint32_t aligned, but this is a tradeoff, since
>> > if we wrote:
>> >
>> >     uint32_t *data = xmmap(...);
>> >
>> > then I think we would have to change the case where ret is non-zero to be:
>> >
>> >     if (data)
>> >         munmap((void*)data, ...);
>> >
>> > and likewise, data_p is const.
>>
>> Doing it that way sounds great to me.  That way, the type contains the
>> information we need up-front and the safety of the cast is obvious in
>> the place where the cast is needed.
>>
>> (Although my understanding is also that in C it's fine to pass a
>> uint32_t* to a function expecting a void*, so the second cast would
>> also not be needed.)

I do not think c99 allows this in 100% of cases - specifically if there a const void * involved. gcc does not care. I do not think c89 cares either. I will watch out for it when this is merged.
--Randall




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux