Re: [PATCH 03/12] shallow.c: use commit-slab for commit depth instead of commit->util

Duy Nguyen <pclouds@xxxxxxxxx> · Sat, 12 May 2018 14:09:18 +0200

On Sat, May 12, 2018 at 11:18 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Sat, May 12, 2018 at 05:07:48AM -0400, Jeff King wrote:
>
>> So no, it wouldn't work to directly store depths with the code as
>> written.  I'm not sure if the depth can ever be 0. If not, then it would
>> be a suitable sentinel as:
>>
>>   int *slot = commit_depth_at(&depths, p->item);
>>   if (!*slot || cur_depth < *slot)
>>       *slot = cur_depth;
>>
>> But somebody would have to dig into the possible values of cur_depth
>> there (which would make sense to do as a separate patch anyway, since
>> the point of this is to be a direct conversion).
>
> By the way, one other approach if xcalloc() doesn't produce a good
> sentinel is to use a data type that does. ;) E.g., something like this
> should work:
>
>   struct depth {
>         unsigned valid:1;
>         int value;
>   };
>   define_commit_slab(commit_depth, struct depth);
>
>   ...
>
>   struct depth *slot = commit_depth_at(&depths, p->item);
>   if (!slot->valid || cur_depth < slot->value) {
>         slot->value = cur_depth;
>         slot->valid = 1;
>   }
>
> That wastes an extra 4 bytes per slot over storing an int directly, but
> it's the same as storing an 8-byte pointer, plus you avoid the storage
> and runtime overhead of malloc.

Or we could have a way to let the user decide initial values. If the
initial value here is -1 (which can't possibly be used in the current
code), it could be the sentinel value.

Did you notice the for loop in the end to free "int *"? I don't like
peeking inside a slab that way and would prefer passing a "free"
function pointer to clear_commit_depth(), or just assign a "free"
function to some new field in struct commit_depth and
clear_commit_depth() will call it. If we have a new field for "free"
callback in the struct, it makes sense to have an "init" callback to
do extra initialization on top of xcalloc.

> I actually wonder if we could wrap commit_slab with a variant that
> stores the sentinel data itself, to make this pattern easier. I.e.,
> something like:
>
>   #define define_commit_slab_sentinel(name, type) \
>         struct name##_value { \
>                 unsigned valid:1; \
>                 type value; \
>         }; \
>         define_commit_slab(name, struct name##_value)
>
> and some matching "peek" and "at" functions to manipulate value
> directly.

If you keep this valid bit in a separate slab, you can pack these bits
very tight and not worry about wasting memory. Lookup in commit-slab
is cheap enough that doing double lookups (one for valid field, one
for value) is not a problem, I think.

> I think the end result would be nicer, but it's turning into a little
> bit of a rabbit hole. So I don't mind going with your direct conversion
> here for now.

Yeah I think I will stick with a faithful conversion for now. The
conversion shows room for improvement which could be the next
microproject (I thought of adding this removing 'util' task as a 2019
microproject but it was too tricky for newcomers to do).
-- 
Duy