Re: read-only working copies using links

On Sat, Jan 24, 2009 at 10:39:46AM -0800, Chad Dombrova wrote:

>> I think Tim Ansell (cced) was talking about this at the gittogether
>> (storing the metadata separately), as it would benefit sparse/narrow
>> checkout, another advantage supporting his case?
>
> what's the case against it, other than the obvious, that it will take  
> more work?

I'm not sure this is actually the same as Tim's proposal. Tim wanted to
store the commit and tree information separately from the blob
information (since his use case was that blobs are enormous, but the
rest is reasonable).

AIUI, Chad's proposal is about storing the actual blob data itself
separate from the blob object's metadata (i.e., its object type and
length headers). Which means that the normal loose object format is not
acceptable, and you would end up with something like (for example):

  .git/objects/pack/pack-full-of-your-regular-stuff.{pack,idx}
  .git/objects/[0-9a-f]{2}/[0-9a-f]{38}/header
  .git/objects/[0-9a-f]{2}/[0-9a-f]{38}/data

or something similar. Then you could hardlink directly to the 'data'
portion. So you would need:

  - to teach everything that ever looks for loose objects how to read
    this new format. In theory, it's all nicely encapsulated in
    sha1_file.c

  - to teach the checkout routines to hardlink in such a case instead of
    copying the file
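
To make the two steps above a bit more concrete, here is a rough sketch
in Python. It is not git's code and not a proposed implementation. The
header/data directory layout is just the example layout from this mail,
and the function names write_split_object and checkout_by_link are made
up for illustration. The only part taken from git itself is the naming
rule: an object's name is the SHA-1 of "<type> <length>\0" followed by
the content.

```python
import hashlib
import os
import tempfile


def write_split_object(objdir, content):
    """Store a blob as separate 'header' and 'data' files.

    Hypothetical layout: <objdir>/<2 hex>/<38 hex>/{header,data}.
    Neither file is compressed, so 'data' is byte-identical to the
    file that would be checked out.
    """
    header = b"blob %d\x00" % len(content)
    # The object name covers header + content, exactly as for a
    # normal loose object.
    sha = hashlib.sha1(header + content).hexdigest()
    d = os.path.join(objdir, sha[:2], sha[2:])
    os.makedirs(d, exist_ok=True)
    with open(os.path.join(d, "header"), "wb") as f:
        f.write(header)
    with open(os.path.join(d, "data"), "wb") as f:
        f.write(content)
    return sha


def checkout_by_link(objdir, sha, dest):
    """Hardlink the 'data' file into the working tree instead of copying."""
    os.link(os.path.join(objdir, sha[:2], sha[2:], "data"), dest)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        objdir = os.path.join(tmp, "objects")
        sha = write_split_object(objdir, b"hello\n")
        dest = os.path.join(tmp, "hello.txt")
        checkout_by_link(objdir, sha, dest)
        # Both paths now refer to the same inode.
        print(sha, os.stat(dest).st_nlink)
```

The reader side (everything that looks up loose objects) is the harder
part and is not sketched here.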

The obvious downsides that I can think of are:

  - it has the potential to make object reading, which is a core part of
    git (read: very performance- and correctness-sensitive) a lot more
    complex. But maybe the implementation would not be that painful;
    somebody would have to look very closely to see.

  - it interacts badly with smudge/clean filters and crlf conversion.
    In those cases you can't hardlink. If you treat this like an
    optimization, though, it's not so bad: we only do the optimization
    when we _can_, and fall back to regular checkout if those other
    options are in effect.
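
The "optimization with fallback" idea could look roughly like the
following Python sketch. Again, this is an illustration under
assumptions, not git's checkout code: checkout_file and to_crlf are
hypothetical names, the "convert" argument stands in for whatever
smudge filter or crlf conversion is configured, and to_crlf is a
deliberately simplistic stand-in for git's real conversion.

```python
import os


def to_crlf(content):
    """Toy stand-in for a crlf conversion (assumes LF-only input)."""
    return content.replace(b"\n", b"\r\n")


def checkout_file(data_path, dest, convert=None):
    """Link when the working-tree bytes equal the blob bytes, else copy.

    Returns "linked" or "copied" so the caller can see which path ran.
    """
    if os.path.lexists(dest):
        # Remove first: opening an existing link for writing would
        # write straight into the object store.
        os.remove(dest)
    if convert is None:
        try:
            os.link(data_path, dest)
            return "linked"
        except OSError:
            pass  # e.g. different filesystem; fall back to a copy
    with open(data_path, "rb") as src:
        content = src.read()
    if convert is not None:
        content = convert(content)
    with open(dest, "wb") as out:
        out.write(content)
    return "copied"
```

Calling checkout_file(data, dest) takes the hardlink path, while
checkout_file(data, dest, convert=to_crlf) always produces an
independent, converted copy.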

  - it's somewhat dangerous to your repository's health. Git's model is
    that object files are immutable (since they are, after all, named
    after their contents). But now you are linking them into your
    working tree, which makes them susceptible to some third party tool
    munging them. So yes, most tools will probably behave, but any tool
    that misbehaves will actually corrupt your repository.
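
The difference between a tool that behaves and one that does not can be
shown with a small Python experiment. This is only a demonstration
under assumptions: the two "tools" are simulated by hand (one writes a
new file and renames it into place, the other rewrites the file in
place), and blob_sha and demo are made-up helper names.

```python
import hashlib
import os
import tempfile


def blob_sha(content):
    """Object name git would give a blob with this content."""
    return hashlib.sha1(b"blob %d\x00" % len(content) + content).hexdigest()


def read_bytes(path):
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Return (intact_after_rename, intact_after_inplace)."""
    with tempfile.TemporaryDirectory() as tmp:
        obj = os.path.join(tmp, "data")       # stands in for the object
        work = os.path.join(tmp, "file.txt")  # working-tree file
        with open(obj, "wb") as f:
            f.write(b"hello\n")
        name = blob_sha(b"hello\n")
        os.link(obj, work)

        # Well-behaved tool: write a new file, rename it over the old
        # one. This breaks the link and leaves the object alone.
        with open(work + ".tmp", "wb") as f:
            f.write(b"edited\n")
        os.replace(work + ".tmp", work)
        intact_after_rename = blob_sha(read_bytes(obj)) == name

        # Misbehaving tool: rewrite the linked file in place. The
        # object's bytes change, so it no longer matches its name.
        os.remove(work)
        os.link(obj, work)
        with open(work, "r+b") as f:
            f.write(b"HELLO\n")
        intact_after_inplace = blob_sha(read_bytes(obj)) == name

        return intact_after_rename, intact_after_inplace


if __name__ == "__main__":
    print(demo())
```

The second case is the one that corrupts the repository: the file under
.git/objects still has its old name but different contents.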

-Peff