Re: metastore

On Wed, 3 Oct 2007, Julian Phillips wrote:

Subject: Re: metastore

On Tue, 2 Oct 2007, David Härdeman wrote:

On Tue, Oct 02, 2007 at 10:04:56PM +0200, David Kastrup wrote:
David Härdeman <david@xxxxxxxxxxx> writes:

>  On Tue, Oct 02, 2007 at 08:53:01PM +0100, martin f krafft wrote:
> > also sprach David Härdeman <david@xxxxxxxxxxx> [2007.09.19.2016 +0100]:
> > >  But I agree, if any changes were made to git, I'd advocate adding
> > >  arbitrary attributes to files (much like xattrs) in name=value
> > >  pairs, then any extended metadata could be stored in those
> > >  attributes and external scripts/tools could use them in some way
> > >  that makes sense...and also make sure to only update them when it
> > >  makes sense.
> > So where would those metadata be stored in your opinion?
>  I'm not sufficiently versed in the internals of git to have an
>  informed opinion :)

I think we have something like a length count for file names in index
and/or tree.  We could just put the (sorted) attributes after a NUL
byte in the file name and include them in the count.  It would also
make those artificially longer file names work more or less when
sorting them for deltification.

Or perhaps the index format could be extended to include a new field for name=value pairs instead of overloading the name field.

But as I said, I have no idea how feasible it would be to change git to support another arbitrary length field in the index/tree file.

However, this requires implementing _policies_: it must be possible to
specify per repository exactly what will and what won't get tracked,
or one will get conflicts that are not necessary or appropriate.

I think the opposite approach would be better. Let git provide set/get/delete attribute operations and leave it at that. Then external programs can do what they want with that data and add/remove/modify tags as necessary (and also include the smarts to not, e.g. remove the permissions on all files if the git repo is checked out to a FAT fs).

You need more than that. You need to be able to log, blame etc on the attributes. One of the big annoyances of Subversion properties is being unable to find out when or why a property value was changed.

I still don't see why the attributes need to be stored in git directly - particularly if you are going to use an external program to actually apply any settings - why not store the attributes as a normal file (or files) of some sort tracked by git? You could use any number of methods - e.g. use an sqlite database stored in the root of your tree, or a .<name>.props file alongside each path that you have properties for. You could even write a system that uses such a method and is therefore SCM agnostic, allowing you to keep your attribute tracking system if/when something better than git comes along - or simply share it with less-fortunate souls stuck in an inferior system.
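
To make the ".<name>.props file alongside each path" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function names, the one "name=value" pair per line layout, and the choice to sort keys are not taken from any existing tool. It only shows that set/get/delete operations can live entirely outside git while the data is tracked as an ordinary file.

```python
import os

def props_path(path):
    """Return the hypothetical sidecar file for `path`: .<name>.props next to it."""
    directory, name = os.path.split(path)
    return os.path.join(directory, "." + name + ".props")

def read_props(path):
    """Load all name=value pairs for `path`; a missing sidecar means no properties."""
    props = {}
    try:
        with open(props_path(path), encoding="utf-8") as fh:
            for line in fh:
                line = line.rstrip("\n")
                if line:
                    # Split on the first '=' only, so values may contain '='.
                    key, _, value = line.partition("=")
                    props[key] = value
    except FileNotFoundError:
        pass
    return props

def write_props(path, props):
    """Write the pairs sorted by name, one per line; remove the sidecar if empty."""
    sidecar = props_path(path)
    if not props:
        if os.path.exists(sidecar):
            os.remove(sidecar)
        return
    with open(sidecar, "w", encoding="utf-8") as fh:
        for key in sorted(props):
            fh.write(f"{key}={props[key]}\n")

def set_prop(path, key, value):
    # Assumption: names contain no '=' or newline, values contain no newline.
    props = read_props(path)
    props[key] = value
    write_props(path, props)

def get_prop(path, key, default=None):
    return read_props(path).get(key, default)

def delete_prop(path, key):
    props = read_props(path)
    props.pop(key, None)
    write_props(path, props)
```

Because each attribute sits on its own line in a tracked text file, the usual log and blame tools of whatever SCM holds the tree apply to attribute changes as well, which covers the objection raised above about Subversion properties.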

one other big advantage of keeping things in a normal file: it's easier to get the results accepted into git!

don't forget that the core git maintainers don't really see this as a worthwhile effort, so the more intrusive the result is the less likely it is to be accepted. It may end up that storing the attributes inside of git _is_ the best thing to do, but it's going to be a whole lot easier to get a patch to implement this accepted if it's a migration from an existing, heavily used implementation than if it's from the 'outside' with people saying "this is a neat thing, we think people would use it if it only had this".

and even if an internal implementation does end up being the right thing, the exact shape of the API is an item that will require a lot of debate (and probably a few false starts) to get right. let's figure out the real-world usage patterns first, and then work from there as appropriate.

shifting back onto implementation details

in the discussion a few weeks ago I was told that there is a way to look at the contents of a file that hasn't been checked out yet (somehow it exists in a useable form 'in the index') but when I asked for information about how to do this I never got a response.

the reason for needing this is that the routines writing the files need to be able to access this information when they are doing so, but that file may not be checked out.
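
One way to read a file's content from the index without checking it out is git's ":<path>" revision syntax, which names the stage-0 index entry for that path, combined with "git cat-file blob". The Python wrapper below is only a sketch: the function name read_from_index and its repo parameter are made up for illustration, and this may not be the mechanism the earlier poster had in mind.

```python
import subprocess

def read_from_index(path, repo="."):
    """Return the bytes recorded in the index for `path` (relative to the
    top of the repository), without reading the working tree copy."""
    result = subprocess.run(
        ["git", "cat-file", "blob", ":" + path],
        cwd=repo,             # run git inside the repository
        check=True,           # raise CalledProcessError if the path is not in the index
        capture_output=True,  # collect stdout instead of printing it
    )
    return result.stdout
```

A tool that applies stored attributes while files are being written could call read_from_index on its attribute file first, so it does not depend on that file having been checked out already.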

for that matter, .gitattributes should have a similar problem (if the .gitattributes for a directory hasn't been checked out yet, how do you know whether to do the line ending conversions on a file or not?). how is the problem addressed there? (or is it the case that all the use so far has really not used the per-directory files and everything is in the master file, and that doesn't change enough to expose these problems?)

David Lang
