On Wed, Jan 6, 2016 at 2:26 PM, Eric Curtin <ericcurtin17@xxxxxxxxx> wrote:
>
> Often I do a standard git clone:
>
> git clone (name of repo)
>
> Followed by a depth=1 clone in parallel, so I can get building and
> working with the code asap:
>
> git clone --depth=1 (name of repo)
>
> Could we change the default behavior of git so that we initially get
> all the current files quickly, so that we can start working on them,
> and then get the rest of the data? At least a user could get to work
> quicker this way. Any disadvantages of this approach?

It would put more burden on a shared and limited resource (i.e. the
server side).

For example, I just tried a depth=1 clone of Linus's repository from

  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

which transferred ~150MB of pack data to check out 52k files in 90
seconds. On the other hand, a full clone transferred ~980MB of pack
data and took 170 seconds to complete.

You can already see that a full clone is highly optimized: it does not
take even twice the time of grabbing just the most recent checkout to
fetch 10 years' worth of development (562k commits).

This efficiency comes from some tradeoffs, one of which is that not all
the data necessary to check out the latest tree contents can be stored
near the beginning of the pack data. So "check out the tip while the
remainder of the data is still incoming" would not be workable unless
you are willing to destroy the full-clone performance.
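(For anyone who wants to reproduce a comparison like the one above, a
rough sketch; the kernel.org URL is the one quoted in the message, the
directory names linux-shallow and linux-full are just illustrative, and
the exact numbers will of course vary with network and server load:

  # shallow clone: fetch only the objects needed for the tip commit
  time git clone --depth=1 \
    git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-shallow

  # full clone: fetch the complete history
  time git clone \
    git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux-full

  # compare how much pack data each clone actually received
  du -sh linux-shallow/.git/objects/pack linux-full/.git/objects/pack

A shallow clone can also be deepened in place later with
"git fetch --unshallow", which gets the remaining history without the
second, parallel full clone.)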