Re: Creating objects manually and repack

On 8/4/06, Rogan Dawes <discard@xxxxxxxxxxxx> wrote:
Jon Smirl wrote:
> On 8/4/06, Linus Torvalds <torvalds@xxxxxxxx> wrote:
>> I'd suggest against it, but you can (and should) just repack often enough
>> that you shouldn't ever have gigabytes of objects "in flight". I'd have
>> expected that with a repack every few ten thousand files, and most files
>> being on the order of a few kB, you'd have been more than ok, but
>> especially if you have large files, you may want to make things "every
>> <n> bytes" rather than "every <n> files".
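
A minimal sketch of the "every <n> bytes" trigger described above, in Python. The names import_with_byte_budget, write_object, repack and max_loose_bytes are all hypothetical and chosen for illustration; how the objects are actually written and repacked is left to the two callbacks.

```python
from typing import Callable, Iterable, Tuple


def import_with_byte_budget(
    objects: Iterable[Tuple[str, bytes]],
    write_object: Callable[[str, bytes], None],
    repack: Callable[[], None],
    max_loose_bytes: int,
) -> int:
    """Write objects, repacking whenever the unpacked bytes reach the budget.

    Returns the number of times repack() was called.
    """
    loose_bytes = 0  # bytes written since the last repack
    repacks = 0
    for name, data in objects:
        write_object(name, data)
        loose_bytes += len(data)
        # Trigger on accumulated size, not on file count, so a few
        # large files cannot push far past the budget unnoticed.
        if loose_bytes >= max_loose_bytes:
            repack()
            repacks += 1
            loose_bytes = 0
    # Pack whatever is left over after the last object.
    if loose_bytes > 0:
        repack()
        repacks += 1
    return repacks
```

In a real import the repack callback could, for instance, shell out to "git repack -a -d"; that choice is an assumption of this sketch, not something the thread specifies.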
>
> How about forking off a pack-objects and handing it one file name at a
> time over a pipe. When I hand it the next file name I delete the first
> file. Does pack-objects make multiple passes over the files? This
> model would let me hand it all 1M files.
>
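
A rough Python sketch of the pipe model proposed above: fork a child, hand it one name per line, and delete the previous file when the next name is handed over. The function name feed_names_and_delete and the cmd argument are hypothetical, and the child is a generic line-reading process, not a model of pack-objects itself (which takes object names rather than file paths on its standard input). Note the open question from the thread: deleting the previous file is only safe if the consumer is done with it and never makes a second pass, and a plain pipe gives the writer no way to know that.

```python
import os
import subprocess
from typing import Iterable, List


def feed_names_and_delete(cmd: List[str], paths: Iterable[str]) -> int:
    """Send each path to cmd's stdin, one per line.

    When the next path is handed over, the previous file is deleted.
    Returns the child's exit status.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, text=True)
    previous = None
    for path in paths:
        proc.stdin.write(path + "\n")
        proc.stdin.flush()
        # Assumption: by now the child has finished with the previous
        # file. Nothing in this sketch verifies that.
        if previous is not None:
            os.remove(previous)
        previous = path
    proc.stdin.close()  # EOF tells the child there are no more names
    status = proc.wait()
    # The last file can go once the child has exited.
    if previous is not None:
        os.remove(previous)
    return status
```

With this shape only a couple of unpacked files exist at any moment, which is the point of the proposal; whether it is usable depends entirely on the consumer making a single pass.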

I'd imagine that this would not necessarily save you a lot, if you have
to write it to disk, and then read it back again. Your only chance here
is if you stay in the buffer, and avoid actually writing to disk at all.

If I keep creating files, reading them, and then deleting them, it is
likely that the same blocks get used over and over. Because the blocks
are reused, that will stop the cache thrashing. Some disk writes will
still happen, but that is far better than doing 12GB of unique writes
followed by 12GB of reads. That 24GB of IO is all on small files, so it
is seek-time limited, since repack does writes in the middle of the
reads.

Of course, using a ramdisk/tmpfs for your object directories might be
enough to save you. Just use a symlink to tmpfs for the objects
directory, and leave the pack files on persistent storage.

The unpacked set of objects is way too big to fit into RAM. Any scheme
using the unpacked objects will spill to disk.

That doesn't answer your question about how many passes pack-objects
does. Nicolas Pitre should be able to answer that.

Rogan



--
Jon Smirl
jonsmirl@xxxxxxxxx