Re: Continue git clone after interruption

On Tue, 18 Aug 2009, Nicolas Pitre wrote:
> On Tue, 18 Aug 2009, Jakub Narebski wrote:
> 
>> You can probably get number and size taken by delta and non-delta (base)
>> objects in the packfile somehow.  Neither "git verify-pack -v <packfile>"
>> nor contrib/stats/packinfo.pl did help me arrive at this data.
> 
> Documentation for verify-pack says:
> 
> |When specifying the -v option the format used is:
> |
> |        SHA1 type size size-in-pack-file offset-in-packfile
> |
> |for objects that are not deltified in the pack, and
> |
> |        SHA1 type size size-in-packfile offset-in-packfile depth base-SHA1
> |
> |for objects that are deltified.
> 
> So a simple script should be able to give you the answer.

Thanks.

There are 114937 objects in this packfile, including 56249 objects
used as a delta base (which may themselves be deltified or not).
git-verify-pack -v shows that the objects have a total size-in-packfile
of 33 MB (which agrees with the packfile size of 33 MB), with 17 MB of
size-in-packfile taken by deltified objects and 16 MB taken by
non-deltified (base) objects.

  git verify-pack -v <packfile> |
    grep -v "^chain" |
    grep -v "^non delta" |
    grep -v "objects/pack/pack-" > verify-pack.out

  sum=0; bsum=0; dsum=0; 
  while read sha1 type size packsize off depth base; do
    echo "$sha1" >> verify-pack.sha1.out
    sum=$(( $sum + $packsize ))
    if [ -n "$base" ]; then 
       echo "$sha1" >> verify-pack.delta.out
       dsum=$(( $dsum + $packsize ))
    else
       echo "$sha1" >> verify-pack.base.out
       bsum=$(( $bsum + $packsize ))
    fi
  done < verify-pack.out
  echo "sum=$sum; bsum=$bsum; dsum=$dsum"
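
The same tally can also be sketched in awk, which additionally counts
the distinct objects used as a delta base (the 56249 figure above).
This is only a sketch: the function name pack_stats is made up for
illustration, and it assumes 40-hex-digit SHA-1 object names and the
"-v" line formats quoted from the documentation above.

```shell
# Summarise "git verify-pack -v" output read from stdin.
# pack_stats is a hypothetical helper name, not part of git.
# Usage (with a placeholder pack name):
#   git verify-pack -v <packfile> | pack_stats
pack_stats() {
  awk '
    # Keep only object lines, which start with a 40-hex-digit SHA-1.
    # This skips the "chain length", "non delta" and "<pack>: ok" lines.
    length($1) != 40 || $1 !~ /^[0-9a-f]+$/ { next }
    {
      total += $4                    # size-in-packfile of every object
      if (NF >= 7) {                 # deltified: has depth and base-SHA1
        dsize += $4; ndelta++
        bases[$7] = 1                # remember each distinct base
      } else {                       # not deltified
        bsize += $4; nbase++
      }
    }
    END {
      nused = 0
      for (b in bases) nused++
      printf "objects=%d non_delta=%d delta=%d used_as_base=%d\n",
             ndelta + nbase, nbase, ndelta, nused
      printf "size_in_pack: total=%d non_delta=%d delta=%d\n",
             total, bsize, dsize
    }'
}
```

Doing it in one awk pass avoids the temporary files, and the test on
the first field makes the separate grep filters unnecessary.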
 
>>>> (BTW what happens if this pack is larger than file size limit for 
>>>> given filesystem?).
[...]

>> If I remember correctly FAT28^W FAT32 has a maximum file size of 4 GB.
>> FAT is often used on SSDs and on USB drives.  Although if you have a
>> 4 GB packfile, you are doing something wrong, or UGFWIINI (Using Git
>> For What It Is Not Intended).
> 
> Hopefully you're not performing a 'git clone' off of a FAT filesystem.  
> For physical transport you may repack with the appropriate switches.

Not off a FAT filesystem, but into a FAT filesystem.
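
For the repacking suggestion, a minimal sketch of what this could look
like follows.  It assumes that "the appropriate switches" means
git-repack's --max-pack-size option; the function names and the byte
limit used in the usage line are made up for illustration.

```shell
# List packfiles under directory $1 that are larger than $2 bytes.
# (hypothetical helper, not part of git)
oversized_packs() {
  find "$1" -type f -name '*.pack' -size +"$2"c
}

# Repack the repository at $1 so that git splits its packs at
# roughly $2 bytes each.  Assumes --max-pack-size is the switch meant.
repack_with_limit() {
  git -C "$1" repack -a -d --max-pack-size="$2"
}

# Usage (path and limit are placeholders):
#   repack_with_limit /path/to/repo 2000000000
#   oversized_packs /path/to/repo/.git/objects/pack 2000000000
```

This only helps when the repository is copied onto the FAT filesystem
as files; it does not change what a clone over the network receives.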
 
[...]

>>> I think it is better to "prime" the repository with the content of the 
>>> top commit in the most straightforward manner using git-archive which 
>>> has the potential to be fully restartable at any point with little 
>>> complexity on the server side.
>> 
>> But didn't it make fully restartable 2.5 MB part out of 37 MB packfile?
> 
> The front of the pack is the critical point.  If you get enough to 
> create the top commit then further transfers can be done incrementally 
> with only the deltas between commits.

How?  You have some objects that can be used as bases; how do we tell
git-daemon that we have them (but not their prerequisites), and how
do we generate the incrementals?

>> A question about pack protocol negotiation.  If a client presents some
>> objects as "have", the server can and does assume that the client has
>> all prerequisites for such objects, e.g. for a tree object that it has
>> all objects for the files and directories inside the tree; for a commit
>> it means all ancestors and all objects in the snapshot (the top tree
>> and its prerequisites).  Do I understand this correctly?
> 
> That works only for commits.

Hmmmm... how do you intend "prefetch top objects restartable-y first"
to work, then?
 
>> BTW. because of compression it might be more difficult to resume 
>> archive creation in the middle, I think...
> 
> Why so?  The tar+gzip format is streamable.

The gzip format uses a sliding window in compression, so the output of
"cat a b | gzip" is different from "cat <(gzip -c a) <(gzip -c b)".
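
The difference can be checked with a small sketch like the one below.
The function name and the generated test files are made up for
illustration; it only assumes a gzip that accepts -c and -d.

```shell
# Compress two inputs both ways and report how the results compare.
gzip_concat_demo() {
  dir=$(mktemp -d) || return 1
  # Two small files with identical content.
  awk 'BEGIN { for (i = 1; i <= 2000; i++) print i }' > "$dir/a"
  cp "$dir/a" "$dir/b"
  cat "$dir/a" "$dir/b" > "$dir/plain"

  # One gzip stream over the concatenated input.
  cat "$dir/a" "$dir/b" | gzip -c > "$dir/whole.gz"
  # Two independent gzip streams, concatenated afterwards.
  { gzip -c < "$dir/a"; gzip -c < "$dir/b"; } > "$dir/parts.gz"

  if cmp -s "$dir/whole.gz" "$dir/parts.gz"; then
    echo "compressed: identical"
  else
    echo "compressed: different"
  fi

  # Both variants should still decompress to the same bytes.
  if gzip -dc "$dir/whole.gz" | cmp -s - "$dir/plain" &&
     gzip -dc "$dir/parts.gz" | cmp -s - "$dir/plain"; then
    echo "decompressed: identical"
  else
    echo "decompressed: different"
  fi
  rm -rf "$dir"
}
```

So the compressed bytes depend on where the stream was started, even
though the decompressed content does not.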

But that doesn't matter.  If we are interrupted in the middle, we can
uncompress what we have to check how far we got, and tell the server
to send the rest; this way the server would not even have to generate
(and then discard without sending) the part we already got from the
partial transfer.
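
The client side of that check could be sketched as follows.  Both
function names are made up for illustration, and how the server would
actually produce "the rest" is left open; rest_of_stream merely skips
bytes of a stream it is given.

```shell
# Print how many bytes of plain data can be recovered from a
# (possibly truncated) gzip file.  gzip's complaint about the
# unexpected end of file is discarded.
recovered_bytes() {
  n=$(gzip -dc "$1" 2>/dev/null | wc -c)
  echo $((n))
}

# Copy stdin to stdout, skipping the first $1 bytes.  This stands in
# for "send the rest" starting at the offset the client reports.
rest_of_stream() {
  tail -c +$(( $1 + 1 ))
}

# Usage (file names are placeholders):
#   offset=$(recovered_bytes partial.tar.gz)
#   rest_of_stream "$offset" < full.tar
```

Note that the recovered offset may be somewhat short of what was
actually transferred, since the tail end of a truncated gzip stream
cannot always be decoded.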

P.S. What do you think about 'bundle' capability extension mentioned
     in a side sub-thread?
-- 
Jakub Narebski
Poland