Re: What is an efficient way to get all blobs / trees that have notes attached?

On Mon, Apr 4, 2016 at 9:46 AM, Sebastian Schuberth
<sschuberth@xxxxxxxxx> wrote:
> On Fri, Apr 1, 2016 at 2:16 PM, Johan Herland <johan@xxxxxxxxxxx> wrote:
>>> 3) Recursively list all blobs / trees (git-ls-tree) and look whether an
>>> object's hash is contained in our table to get its notes.
>>>
>>> In particular 3) could be expensive for repos with a lot of files as we're
>>> looking at all of them just to see whether they have notes attached.
>>
>> In (3), why would you need to search through _all_ blobs/trees? Would
>> it not be cheaper to simply query the object type of each annotated
>> object from (2)? I.e. something like:
>>
>> for notes_ref in $(git for-each-ref refs/notes | cut -c 49-)
>> do
>>     echo "--- $notes_ref ---"
>>     for annotated_obj in $(git notes --ref=$notes_ref list | cut -c 41-)
>>     do
>>         type=$(git cat-file -t "$annotated_obj")
>>         if test "$type" != "commit"
>>         then
>>             echo "$annotated_obj: $type"
>>         fi
>>     done
>> done
>
> Thanks for the idea. The problem is that I do want to list the notes
> by path of the object they belong to. As a blob could potentially
> belong to more than one path (copies of files in the repo), I do not
> see another way of getting that information other than iterating over
> all blobs and checking what path(s) they belong to.

True; fundamentally what you want is a blob/tree ID -> path(s) mapping,
which is an independent problem, unrelated to the initial notes lookup.

I don't know of a solution faster than the brute-force search you already
sketched. If this lookup is important to your use case, you could consider
building/caching the required mapping when the notes are added in the
first place, but I don't know if that is possible in your scenario...
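
For reference, a minimal sketch of that brute-force lookup done as a
single join rather than one lookup per object. It assumes the usual
output formats of "git notes list" (note ID, then annotated object ID)
and "git ls-tree -r -t" (mode, type, object ID, TAB, path). The helper
name join_notes_to_paths and the two input files are made up for
illustration, and paths with unusual characters would need -z handling
that this sketch leaves out:

```shell
#!/bin/sh
# join_notes_to_paths is a hypothetical helper, not a git command.
#
# $1: file with `git notes --ref=<ref> list` output
#     ("<note-id> <object-id>" per line)
# $2: file with `git ls-tree -r -t <rev>` output
#     ("<mode> <type> <object-id><TAB><path>" per line)
#
# Prints "<type> <object-id> <path>" for every tree entry whose object
# has a note. An object reachable under several paths is printed once
# per path.
join_notes_to_paths() {
    awk '
        # First file: remember every annotated object ID.
        NR == FNR { annotated[$2] = 1; next }
        # Second file: split at the TAB so paths with spaces survive.
        {
            tab = index($0, "\t")
            split(substr($0, 1, tab - 1), meta, " ")
            if (meta[3] in annotated)
                print meta[2], meta[3], substr($0, tab + 1)
        }
    ' "$1" "$2"
}

# Possible usage, with notes.txt and tree.txt as scratch files:
#   git notes --ref="$notes_ref" list > notes.txt
#   git ls-tree -r -t HEAD > tree.txt
#   join_notes_to_paths notes.txt tree.txt
```

This still walks every entry of the tree once, so it does not avoid
the cost you were worried about. It only keeps that cost at one pass
per revision instead of one pass per annotated object.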


...Johan

> --
> Sebastian Schuberth

-- 
Johan Herland, <johan@xxxxxxxxxxx>
www.herland.net