Re: [PATCH] fast-import: replace custom hash with hashmap.c

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 06.04.20 um 20:35 schrieb Jeff King:
> On Mon, Apr 06, 2020 at 08:07:45PM +0200, René Scharfe wrote:
>
>>> Interesting. I re-ran mine just to double check, and got:
>> [...]
>> A second run today reported:
>>
>> Test                        origin/master         HEAD
>> ---------------------------------------------------------------------------
>> 9300.1: export (no-blobs)   61.58(59.93+1.40)     60.63(59.35+1.22) -1.5%
>> 9300.2: import (no-blobs)   239.64(239.00+0.63)   246.02(245.18+0.82) +2.7%
>>
>> git describe says v5.4-3890-g86fd3e9df543 in that repo.
>
> I'm on v5.6-rc7-188-g1b649e0bcae7, but I can't imagine it makes a big
> difference.
>
>> Dunno.  My PC has thermal issues and stressing it for half an hour straight
>> may cause it to throttle?
>
> Yeah. I wonder what the variance is between the 3 runs (assuming you're
> using ./run and doing 3). I.e., is one run in the first set much faster
> than the others, and we pick it as best-of-3.

I did use run with the default three laps.  Ran p9300 directly with
patch v2 applied instead now, got this:

Test                        this tree
-----------------------------------------------
9300.1: export (no-blobs)   64.80(60.95+1.47)
9300.2: import (no-blobs)   210.00(206.02+2.04)
9300.1: export (no-blobs)   62.58(60.74+1.35)
9300.2: import (no-blobs)   209.07(206.78+1.96)
9300.1: export (no-blobs)   61.33(60.17+1.03)
9300.2: import (no-blobs)   207.73(205.47+2.04)

> But we
> really do lie to container_of(). See remote.c, for example. In
> make_remote(), we call hashmap_get() with a pointer to lookup_entry,
> which is a bare "struct hashmap_entry". That should end up in
> remotes_hash_cmp(), which unconditionally computes a pointer to a
> "struct remote".

Ugh.  Any optimization level would delay that to after the keydata
check, though, I guess.

There's also function pointer casting going on (with patch_util_cmp()
and sequence_entry_cmp()), which is undefined behavior IIUC.

> Now the hashmap_entry there is at the beginning of the struct, so the
> offset is 0 and the two pointers are the same. So while the pointer's
> type is incorrect, we didn't compute an invalid pointer value. And
> traditionally the hashmap code required that it be at the front of the
> containing struct, but that was loosened recently (and container_of
> introduced) in 5efabc7ed9 (Merge branch 'ew/hashmap', 2019-10-15).

Hmm, OK.

> And grepping around, I don't see any cases where it's _not_ at the
> beginning. So perhaps this is a problem waiting to bite us.

Possibly.

Speaking of grep, I ended up with this oneliner to show all the
comparison functions used with hashmaps:

git grep -W -f <(git grep -wh hashmap_init | cut -f2 -d, | grep -v -e ' .* ' -e NULL -e '^ *$' | sed 's/ //; s/.*)//; s/^/int /' | sort -u)

But I didn't learn much from looking at them, except that most have
that unconditional container_of.

>> Stuffing the oidhash() result into a variable and using it twice with
>> hashmap_entry_init() would work as well.  This would make the reason for
>> the duplicate find_object() code obvious, while keeping struct
>> hashmap_entry opaque.
>
> I'd prefer not to use a separate variable, as that requires giving it a
> type.

Specifying the type that oidhash() returns and hashmap_entry_init()
accepts is redundant and inconvenient, but that line should be easy to
find and adapt when the function signatures are eventually changed.
Which is perhaps never going to happen anyway?

> Perhaps:
>
>   hashmap_entry_init(&e->ent, lookup_entry.hash);
>
> which is used elsewhere is OK? That still assumes there's a hash field,
> but I think hashmap has always been pretty up front about that (the real
> sin in the original is assuming that nothing else needs to be
> initialized).

That conflicts with that statement from the hashmap.h:

 * struct hashmap_entry is an opaque structure representing an entry in the
 * hash table.

Patch v2 looks good to me, though!

René




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux