Re: [PATCH v2 00/16] First class shallow clone

From: "Duy Nguyen" <pclouds@xxxxxxxxx>
Sent: Tuesday, July 23, 2013 2:20 AM
On Tue, Jul 23, 2013 at 6:41 AM, Philip Oakley <philipoakley@xxxxxxx> wrote:
From: "Nguyễn Thái Ngọc Duy" <pclouds@xxxxxxxxx>
Subject: [PATCH v2 00/16] First class shallow clone

It's nice to see that shallow can be a first class clone.

Thinking outside the box, does this infrastructure offer the opportunity to
maybe add a date based depth option that would establish the shallow
watermark based on date rather than count. (e.g. the "deepen" SP depth could

I've been carefully avoiding the deepen issues because, as you see,
it's complicated. But no, this series does not enable or disable new
deepen mechanisms. They can always be added as protocol extensions.
Still thinking if it's worth exposing a (restricted form of) rev-list
to the protocol..

Interesting idea.

have an alternate with a leading 'T' to indicate a time limit rather than a revision count - I'm expecting such a format would be an error for existing
servers).
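
To make the 'T' idea concrete, here is a minimal sketch in Python of how a receiver might tell the two forms apart. It is an illustration only, not git's implementation: the "deepen T<unix-time>" syntax is the hypothetical extension proposed above, and the function names are made up for this example. The count-only parser stands in for an existing server and rejects the 'T' form, which matches the expectation that the new format would be an error there.

```python
def parse_deepen_count_only(line):
    """Stand-in for an existing server: accepts only 'deepen <positive-int>'."""
    keyword, sep, arg = line.strip().partition(" ")
    if keyword != "deepen" or not sep:
        raise ValueError("not a deepen line: %r" % line)
    # Strict check: ASCII digits only, so 'T123' is rejected.
    if not (arg.isascii() and arg.isdigit()) or int(arg) == 0:
        raise ValueError("invalid deepen argument: %r" % arg)
    return int(arg)


def parse_deepen(line):
    """Hypothetical extended parser.

    Returns ("count", n) for 'deepen <n>' and ("time", t) for the
    proposed 'deepen T<t>' form, where t is assumed to be a Unix timestamp.
    """
    keyword, sep, arg = line.strip().partition(" ")
    if keyword != "deepen" or not sep:
        raise ValueError("not a deepen line: %r" % line)
    if arg.startswith("T"):
        stamp = arg[1:]
        if not (stamp.isascii() and stamp.isdigit()):
            raise ValueError("invalid time limit: %r" % arg)
        return ("time", int(stamp))
    return ("count", parse_deepen_count_only(line))
```

Whether the time limit would be a Unix timestamp or some other date format is left open in the discussion; the timestamp is only an assumption to keep the sketch short.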

My other thought was this style of cut limit list may also allow a big file limit to do a similar process of listing objects (e.g. blobs) that are size-shallow in the repo, though it may be a long list on some repos, or with
a small size limit.

This one, on the other hand, changes the "shape" of the repo (now with
holes) and might need to go through the same process we do with this
series. Maybe we should prepare for it now. Do you have a use case for
size-based filtering? What can we do with a repo with some arbitrary
blobs missing? Another form of this is narrow clone, where we cut by
paths, not by blob size. Narrow clone sounds more useful to me because
it's easier to control what we leave out.

In some sense a project with a sub-module is a narrow clone, split at a 'commit' object. One use case comes from comments on the git-user list about large files being added accidentally, which then make the repo's footprint pretty large [Git is consuming very much RAM]. The bigFileThreshold is one way of spotting such files as separate objects and 'trimming' them.

It doesn't feel right to 'track files and directories' as paths for doing a narrow clone - it'd probably fall into the same trap as tracking file renames. However, if one tracks trees and blobs (as a list of sha1 values, possibly with their source path) then it should be possible to allow work on the repo with those empty directories/files in the same manner as is used for sub-modules, possibly with some form of git-link file as an alternate marker.

The thought process is to map sub-module working onto the other object types (blobs and trees). The user would be unable to edit the trimmed files/directories anyway, so their sha1 values can't change, allowing them to be included unchanged in the next commit in the branch series.
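
A rough sketch of that bookkeeping, in Python, under stated assumptions: the TrimmedEntry record and next_tree_entries helper are hypothetical names for this illustration, and a flat path-to-sha1 mapping stands in for real tree objects. The only part taken from git itself is the blob hashing, which is the sha1 of the header "blob <size>\0" followed by the content. Trimmed paths have no local content, so their recorded sha1 is carried forward as-is.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TrimmedEntry:
    """Hypothetical record for an object left out of the local repo."""
    sha1: str   # object name recorded when the object was trimmed
    kind: str   # "blob" or "tree"
    path: str   # source path, kept only as a hint for the user


def blob_sha1(data: bytes) -> str:
    """Object name git gives a blob with this content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def next_tree_entries(worktree: Dict[str, bytes],
                      trimmed: List[TrimmedEntry]) -> Dict[str, str]:
    """Map each path to the sha1 that the next commit would record.

    Present files are hashed from their content; trimmed paths reuse
    the recorded sha1, since the user cannot have edited them.
    """
    entries = {path: blob_sha1(data) for path, data in worktree.items()}
    for item in trimmed:
        if item.path in entries:
            raise ValueError("trimmed path has local content: %s" % item.path)
        entries[item.path] = item.sha1
    return entries
```

This is close to how a sub-module entry behaves: the parent records an object name without needing the object's content to be present.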

Philip

--
Duy
--

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



