Re: Using trees for metatagging

On Thursday 18 February 2010, martin f krafft wrote:
> Git's object store uses trees mainly to represent a hierarchical
> filesystem. It occurs to me that you could layer additional
> hierarchies on top — specifically, you could use it to track subsets
> of files, i.e. "tagging".
>
> For instance you want some sort of representation for "the set of
> files that need review". You /could/ create a new tree and reference
> all files in that set as children. Now if you wanted to find out
> what to review, you'd list the children of this tree. After
> reviewing a file, you write a new tree with the set less that file's
> ref. Obviously, if you made changes to the file, it should be
> reconnected to all other trees that referenced it.
>
> I have a couple of questions about this:
>
> 1. Does Git provide plumbing for me to find out which trees
>    reference a given blob? If not, I will have to iterate all trees
>    and record which ones have a given message as a child.
>
> 2. Is there a way you can fathom by which unlinking a blob from the
>    main hierarchy also causes it to be unlinked from this meta tree
>    I am speaking of as well? Similarly, if a blob is rewritten, how
>    could I make sure it replaces the old blob in all referencing
>    trees?
>
> 3. Am I right in assuming that I'd have to track a completely
>    separate ancestry for this tree, that is create e.g. a commit
>    object, point refs/metatrees/mytree to it, and reference the tree
>    from the commit?
>
> 4. Since this hierarchy is not really to be mapped into the
>    filesystem, how would one resolve conflicts when merging
>    ancestries? Of course it would be nice if I could check out this
>    meta tree into the filesystem, make changes, and be assured that
>    new blobs replace old blobs in other referencing trees, as per
>    (2.), but that's a pipedream maybe.
>
> 5. Do you know of similar efforts? Are there must-reads out there,
>    apart from the design of Git?

Take a look at the (relatively) new notes feature. (See the jh/notes 
series in 'pu' and various recent discussions on this mailing list.) 
Git notes probably won't satisfy the exact requirements you list above, 
but it _does_ tackle some parallel issues (e.g. how to maintain a tree 
that is not checked out, storing metadata associated with Git objects, 
etc.). If you take a step back and reconsider your original problem, 
you might find that it's solvable by using Git notes.

For example, you could add a simple note to each blob that has been 
reviewed, on the refs/notes/reviewed notes ref. You could then write a 
simple script (using "git notes list") that lists all blobs (i.e. 
files) without a corresponding note in refs/notes/reviewed.
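
As a rough sketch of such a script (not a tested tool): the Python below 
assumes that "git notes list" prints one "<note object> <annotated object>" 
pair per line and that "git ls-tree -r" prints "<mode> <type> <object>", a 
tab, then the path. The function names, the choice of HEAD as the tree to 
scan, and the short ref name "reviewed" (standing in for 
refs/notes/reviewed) are my own assumptions for illustration.

```python
import subprocess


def parse_reviewed(notes_output):
    """Return the set of object IDs that carry a note.

    Expects lines of the form "<note object> <annotated object>".
    """
    reviewed = set()
    for line in notes_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            reviewed.add(parts[1])
    return reviewed


def unreviewed_files(ls_tree_output, reviewed):
    """Return (path, blob_id) pairs for blobs that have no note.

    Expects lines of the form "<mode> <type> <object><TAB><path>".
    Non-blob entries (e.g. submodule commits) are skipped.
    """
    result = []
    for line in ls_tree_output.splitlines():
        meta, sep, path = line.partition("\t")
        fields = meta.split()
        if sep and len(fields) == 3 and fields[1] == "blob" \
                and fields[2] not in reviewed:
            result.append((path, fields[2]))
    return result


def git(*args):
    """Run a git command in the current directory and return its stdout."""
    return subprocess.run(["git", *args], check=True,
                          capture_output=True, text=True).stdout


if __name__ == "__main__":
    # Assumption: review notes live on refs/notes/reviewed.
    reviewed = parse_reviewed(git("notes", "--ref=reviewed", "list"))
    for path, blob in unreviewed_files(git("ls-tree", "-r", "HEAD"), reviewed):
        print(blob, path)
```

Marking a file as reviewed would then be something along the lines of 
"git notes --ref=reviewed add -m reviewed <blob>". Since the note is 
attached to the blob ID, a modified file gets a new blob and shows up as 
unreviewed again, which may or may not be what you want. Paths containing 
unusual characters are quoted by ls-tree and are not handled here.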


...Johan

-- 
Johan Herland, <johan@xxxxxxxxxxx>
www.herland.net