Re: is hosting a read-mostly git repo on a distributed file system practical?

On Wed, Apr 13, 2011 at 12:06 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> On Tue, Apr 12, 2011 at 21:40, Jon Seymour <jon.seymour@xxxxxxxxx> wrote:
>> The idea is that most developers would use the DFS-based repo to track
>> the tip of the development stream, but only the integrator would
>> publish updates to the DFS-based repo.
>>
>> As such, the need to repack the DFS-based repo will be somewhat, but
>> not completely, reduced.
>
> Serving git clone is basically a repack operation when run over
> git://, http:// or SSH. If the DFS was mounted as a local filesystem,
> git clone would turn into a cpio to copy the directory contents. I'm
> not sure if that is what you are suggesting to do here or not.
>

All clients, including the client that occasionally updates the
read-mostly repo, would be mounting the DFS as a local file system.
My environment is one where a DFS is easy to come by, but
establishing a shared server is more complicated (i.e. bureaucratic).
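
For the initial population I'd expect clients to clone straight off
the mounted path, e.g. (the path here is purely illustrative):

    # clone from the DFS mount as if it were local disk;
    # --no-hardlinks forces a real byte copy, since hard links
    # may not be supported (or safe) across the DFS mount
    git clone --no-hardlinks /mnt/dfs/project.git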

I guess I am prepared to put up with a slow initial clone (my
developer pool will be relatively stable, and pulling from a peer via
git: or ssh: will usually be acceptable for this occasional need).
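
For that occasional case I'm picturing something like this (host and
paths are illustrative):

    # initial clone from a peer, then track the shared DFS repo
    git clone ssh://peer.example.com/home/dev/project.git
    cd project
    git remote add dfs /mnt/dfs/project.git
    git fetch dfs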

What I am most interested in is the incremental performance. Can my
integrator, who occasionally updates the shared repo, avoid
automatically repacking it (and hence avoid the whole-of-repo latency
hit), and can the developers who pull those updates do so reliably
without a whole-of-repo scan?
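
Concretely, I'm imagining the integrator turning off automatic gc in
the shared repo, along these lines (standard config keys, the path is
illustrative):

    # never trigger gc --auto from ordinary operations
    git --git-dir=/mnt/dfs/project.git config gc.auto 0
    # don't run gc --auto after pushes into the shared repo
    git --git-dir=/mnt/dfs/project.git config receive.autogc false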

>> Is this going to be practical, or are whole of repo operations
>> eventually going to kill me because of latency and bandwidth issues
>> associated with use of the DFS?
>
> Latency is a problem. The Git pack file has decent locality, but there
> are some things that could still stand to be improved. It really
> doesn't work well unless the pack is held completely in the machine's
> memory.

I understand that avoiding repacking for an extended period brings
its own problems, so I guess I could live with a local repack
followed by an rsync transfer to re-initialize the shared remote,
if that proved warranted.
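
Roughly (paths illustrative, and assuming nobody is fetching while
the rsync runs):

    # repack aggressively on fast local disk
    git --git-dir=/local/project.git repack -a -d -f
    git --git-dir=/local/project.git prune-packed
    # then mirror the freshly packed repo onto the DFS
    rsync -a --delete /local/project.git/ /mnt/dfs/project.git/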

I agree that there is no substitute for testing this, but the
experience of others is helpful in deciding whether it is even worth
attempting.

>
> --
> Shawn.
>