Re: [RFC v3] cat-file: add a --stdin-cmd mode

On Mon, Jan 31 2022, Junio C Hamano wrote:

> Bagas Sanjaya <bagasdotme@xxxxxxxxx> writes:
>
>> On 29/01/22 01.33, John Cai wrote:
>>> Future improvements:
>>> - a non-trivial part of "cat-file --batch" time is spent
>>> on parsing its argument and seeing if it's a revision, ref etc. So we
>>> could add a command that only accepts a full-length 40
>>> character SHA-1.
>>
>> I think the full hash is actually revision name.
>
> There is no entry for "revision name" in Documentation/glossary-content.txt
> ;-)
>
> But to John, if you have a loop that feeds each line to get_oid(),
>
> 	while (getline(buf)) {
> 		struct object_id oid;
> 		if (get_oid(buf, &oid))
> 			warn and continue;
> 		use oid;
> 	}
>
> is it much slower than a mode that can ONLY handle a full object
> name input, i.e.
>
> 	while (getline(buf)) {
> 		struct object_id oid;
> 		if (get_oid_hex(buf, &oid))
> 			warn and continue;
> 		use oid;
> 	}
>
> when your input is restricted to full object names?
>
> get_oid() == repo_get_oid()
> -> get_oid_with_context()
>    -> get_oid_with_context_1()
>       -> get_oid_1()
> 	 -> peel_onion()
> 	 -> get_oid_basic()
> 	    -> get_oid_hex()
> 	    -> repo_dwim_ref()
>
> it seems that with warn_ambiguous_refs and
> warn_on_object_refname_ambiguity we would waste time on refname
> discovery, but I see cat-file already has some provision to disable
> this check.  So when we do not need to call repo_dwim_ref(), do we
> still spend measurable cycles in this callchain?

For what it's worth I think this claim that we spend a non-trivial
amount of time on the difference between these two comes from me
originally. I'd had a chat with John about various things to try out in
such a "cat-file --batch" mode, and this was one of those things.

I tried instrumenting the relevant code in builtin/cat-file.c the other
day (but forgot to reply to this thread), and I couldn't reproduce
whatever I'd found back then (this was weeks/months ago).

So there's probably nothing worthwhile to check out here, i.e. the
trivial cost of get_oid_with_context() is probably nothing to worry
about.
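
For anyone who wants to re-measure this outside of git.git, here is a
minimal standalone sketch of what a "full object name only" parser has
to do. It is not git's get_oid_hex(); parse_full_hex_oid(), hex_digit()
and the RAW_SZ/HEX_SZ macros are hypothetical names made up for
illustration, and it assumes SHA-1 (40 hex characters, 20 raw bytes):

```c
#include <stddef.h>
#include <string.h>

#define RAW_SZ 20            /* assumed: SHA-1 raw size in bytes */
#define HEX_SZ (2 * RAW_SZ)  /* 40 hex characters */

/* Hypothetical helper: value of one hex digit, or -1 if not hex. */
static int hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical stand-in for a parser that ONLY accepts a full object
 * name. Returns 0 and fills out[] on success, -1 on any other input
 * (abbreviated hashes, ref names, trailing garbage).
 */
static int parse_full_hex_oid(const char *line, unsigned char out[RAW_SZ])
{
	size_t i;

	if (strlen(line) != HEX_SZ)
		return -1;
	for (i = 0; i < RAW_SZ; i++) {
		int hi = hex_digit(line[2 * i]);
		int lo = hex_digit(line[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i] = (unsigned char)((hi << 4) | lo);
	}
	return 0;
}
```

The work per line is one length check plus 40 table-free digit
conversions, so the interesting number is how much the general
callchain adds on top of that when the ref lookup is disabled, which
per the above looks like not much.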


