Re: Creating objects manually and repack

On Fri, 4 Aug 2006, Jon Smirl wrote:
> 
> How about forking off a pack-objects and handing it one file name at a
> time over a pipe. When I hand it the next file name I delete the first
> file. Does pack-objects make multiple passes over the files? This
> model would let me hand it all 1M files.

pack-objects does actually make several (well, two) passes over the 
objects right now, because it first does all the sorting based on object 
size/type, and then does the actual deltifying pass. 
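As a rough illustration of that first pass, here is a minimal sketch of a type/size sort in C. It is not git's actual code: struct obj_entry, cmp_type_size() and sort_for_delta() are made-up names, and "largest first within a type" is an assumption about the ordering, not something stated above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object record; git's real structures differ. */
enum obj_type { OBJ_COMMIT = 1, OBJ_TREE = 2, OBJ_BLOB = 3 };

struct obj_entry {
	enum obj_type type;
	unsigned long size;
};

/* Group by type, then (by assumption) largest first within a type. */
static int cmp_type_size(const void *a, const void *b)
{
	const struct obj_entry *x = a, *y = b;

	if (x->type != y->type)
		return x->type < y->type ? -1 : 1;
	if (x->size != y->size)
		return x->size > y->size ? -1 : 1;
	return 0;
}

/* Pass 1: order the candidates so similar objects end up adjacent. */
void sort_for_delta(struct obj_entry *objs, size_t n)
{
	assert(objs != NULL || n == 0);
	qsort(objs, n, sizeof(*objs), cmp_type_size);
}
```

The second pass would then walk the sorted array and try deltas against nearby entries. Because the sort needs the whole list first, nothing can be deltified until every name has been read.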

But doing things one file name at a time would certainly be fine. You can 
even do it with git-pack-objects running in parallel with the generator, 
i.e. you can do a

	for_each_filename() {
		cvs-generate-objects filename | git-pack-objects filename
		rm -rf .git/objects/??/
	}

and then "cvs-generate-objects" should just make sure that it writes the 
git object _before_ it actually outputs the object name on stdout.
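The ordering requirement can be sketched like this in C. write_then_announce() is a hypothetical helper, not a git function: it assumes the object has already been encoded and its hex name computed elsewhere, and it only shows that the file is completely written and closed before the name goes down the pipe.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: store an already-encoded object under `path`,
 * and only then print its name on `out`. Computing the SHA-1 and the
 * on-disk object encoding is assumed to happen elsewhere.
 */
int write_then_announce(const char *path, const void *buf, size_t len,
			const char *hex_name, FILE *out)
{
	FILE *f;

	assert(path && hex_name && out);

	f = fopen(path, "wb");
	if (!f)
		return -1;
	if (len && fwrite(buf, 1, len, f) != len) {
		fclose(f);
		remove(path);
		return -1;
	}
	/* fclose() flushes the stdio buffer; if it fails the file may be short */
	if (fclose(f) != 0) {
		remove(path);
		return -1;
	}

	/* The object is complete: now the name can be handed downstream. */
	if (fprintf(out, "%s\n", hex_name) < 0 || fflush(out) != 0)
		return -1;
	return 0;
}
```

On any error the name is never printed, so the reader on the other end of the pipe is never told about an object that is missing or incomplete.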

And if you do it this way, you won't even have to pass any filenames, 
since git-pack-objects will only get objects for the same file, and will 
do the right thing just sorting them by size.

So in the above kind of setting, the _only_ thing that 
cvs-generate-objects needs to do is:

	for_each_rev(file) {
		unsigned char sha1[20];
		unsigned long len;
		void *buf;

		/* unpack the revision into memory */
		buf = cvs_unpack_revision(&len);

		/* Write it out as a git blob file */
		write_sha1_file(buf, len, "blob", sha1);

		/* Free the memory image */
		free(buf);

		/* Tell git-pack-objects the name of the git blob */
		printf("%s\n", sha1_to_hex(sha1));
	}

and you're basically all done. The above would turn each *,v file into a 
*-<sha>.pack/*-<sha>.idx file pair, so you'd have exactly as many 
pack-files as you have *,v files.

		Linus