Re: [PATCH] cherry: cache patch-ids to avoid repeating work

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:

> Okay, it seems like I never have time to review this, so I'll just 
> take a few minutes to comment on some aspects:
>
>> @@ -1094,6 +1104,8 @@ int cmd_cherry(int argc, const char **argv,
>> const char *prefix)
>>  	const char *limit = NULL;
>>  	int verbose = 0;
>> 
>> +	git_config(git_cherry_config, NULL);
>> +
>>  	if (argc > 1 && !strcmp(argv[1], "-v")) {
>>  		verbose = 1;
>>  		argc--;
>
> Is this really purely for cherry, and not at all for "log --cherry-pick"?  
> Maybe it should be "cache.patchIds" to begin with.

What other things would we want caches for?

As a general rule, I'd prefer keeping these unproven new features opt
in (i.e. default to false unless explicitly asked for).

>> +union cached_sha1_map_header {
>> +	struct {
>> +		char signature[4]; /* CS1M */
>> +		uint32_t version;
>> +		uint32_t count;
>> +		uint32_t size;
>> +		uint32_t pad; /* pad to 20 bytes */
>> +	} u;
>> +	/* pad header out to 40 bytes.  As a consistency
>> +	 * check, pad.value stores the sha1 of pad.key. */
>> +	struct cached_sha1_entry pad;
>
> Why does it have to be a union?

Hmm.  I think you are right.

	struct cached_sha1_map_header {
        	char signature[4];
                uint32_t version;
                uint32_t count;
                uint32_t size;
                uint32_t unused;
		unsigned char csum[20];
	};

would equally be good, as long as we assume the struct is naturally
packed.  I do agree with you that it may not worth checking only the
header, though. 

>> +static const char *signature = "CS1M";
>
> Carrie Scr*ws 1 Man?

No Idea ;-)

>> +	cache->mmapped = 0;
>> +	cache->dirty = 1;
>
> Is it already dirty?  I don't think so.

This flag is more about "do we need to write it back to file", and when it
starts out without reading from an existing file, we always need to as
long as the table contains something at the end of the processing.

You could instead check (!cache->mmapped && cache->count) for that, I
guess.

>> +	cache->entries = calloc(size, sizeof(struct cached_sha1_entry));
>> +	if (!cache->entries) {
>> +		warning("failed to allocate empty map of size %"PRIu32" for %s",
>> +			size, git_path(cache->filename));
>
> xcalloc() to the rescue.

This is purely optional cache and we would want to degrade to operate
without it if any of these fails.  xcalloc() won't let you do so.

> Really, I think that these checks should be _made_ unnecessary, by 
> restricting the size of the cache.  IMO Caching more than 2^10 patch ids 
> (completely made up on the spot) is probably even detrimental, and it 
> might be better to just scratch them all and start with a new cache then.

Probably.  Or fall back on uncached operation.

>> +static int init_cached_sha1_map(struct cached_sha1_map *cache)
>> +{
>>
>> [...]
>>
>> +	SHA1_Init(&ctx);
>> +	SHA1_Update(&ctx, header.pad.key, 20);
>> +	SHA1_Final(header.pad.key, &ctx); /* reuse pad.key to store its sha1 */
>> +	if (hashcmp(header.pad.key, header.pad.value)) {
>> +		warning("%s header has invalid sha1", filename);
>> +		goto empty;
>> +	}
>
> I do not think that it is worth checking that.  If you do not trust your 
> hard disk, you might just as well jump out the window.
>
> Checking just takes too much time.

This is only checking the header, so it won't take much time, but I tend
to doubt the value of this.

>> +	/* mmap entire file so that file / memory blocks are aligned */
>> +	map_size = sizeof(struct cached_sha1_entry) * (cache->size + 1);
>> +	cache->entries = mmap(NULL, map_size,
>> +		PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
>
> AFAIR there were _serious_ performance issues with mmap() on non-Linux 
> platforms.  I chose pread() in my original implementation for a reason.

That is not a reason to punish users on platforms with working mmap(2) ;-).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux