Re: [PATCH 06/15] run-job: auto-size or use custom pack-files batch

Hi Derrick,

I have been reviewing these jobs' mechanics closely and have some questions:

> The dynamic default size is computed with this idea in mind for
> a client repository that was cloned from a very large remote: there
> is likely one "big" pack-file that was created at clone time. Thus,
> do not try repacking it as it is likely packed efficiently by the
> server. Instead, try packing the other pack-files into a single
> pack-file.
>
> The size is then computed as follows:
>
> batch size = total size - max pack size

Could you please elaborate on why this is the best value?
In practice, I have been testing this with the following script:

> % cat debug.sh
> #!/bin/bash
>
> temp=$(du -cb .git/objects/pack/*.pack)
>
> total_size=$(echo "$temp" | grep total | awk '{print $1}')
> echo total_size
> echo $total_size
>
> biggest_pack=$(echo "$temp" | sort -n | tail -2 | head -1 | awk '{print $1}')
> echo biggest pack
> echo $biggest_pack
>
> batch_size=$(expr $total_size - $biggest_pack)
> echo batch size
> echo $batch_size

If you were to run

> git multi-pack-index repack --batch-size=$(./debug.sh | tail -1)

then nothing would be repacked.
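
One possible explanation, sketched below under my own assumptions: as I
read the multi-pack-index documentation, packs are considered in order of
modification time, any pack whose expected size is at or above the batch
size is skipped, and the repack only happens if at least two packs are
selected and their combined expected size reaches the batch size. The
expected size can be smaller than the on-disk size reported by du, so the
small packs may never add up to "total size - max pack size". The
select_batch function and all sizes here are hypothetical and only model
that rule; this is not git's actual code.

```shell
#!/bin/sh
# Hypothetical model of the batch selection rule (an assumption based on
# my reading of the documentation, not code taken from git).
# Usage: select_batch <batch_size> <expected_size>...
# Expected sizes are given in the order the packs would be considered.
select_batch() {
	batch_size=$1
	shift
	total=0
	count=0
	for size in "$@"; do
		# Stop once the selected packs already reach the batch size.
		[ "$total" -ge "$batch_size" ] && break
		# Skip packs whose expected size is not below the batch size.
		[ "$size" -ge "$batch_size" ] && continue
		total=$((total + size))
		count=$((count + 1))
	done
	if [ "$total" -lt "$batch_size" ] || [ "$count" -lt 2 ]; then
		echo "nothing to repack"
	else
		echo "repack $count packs ($total bytes)"
	fi
}

# Illustrative numbers only: on-disk sizes 900, 100, 50, 30 give
# batch size = 1080 - 900 = 180, but the expected sizes of the small
# packs (95, 50, 30 here) add up to only 175.
select_batch 180 900 95 50 30

# A smaller batch size is reached by the first two small packs.
select_batch 101 900 100 50 30
```

With these made-up sizes, the first call selects nothing and the second
selects two packs, which matches what I am seeing in practice.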

Instead, I have had a lot more success with the following:

> # Get the 2nd biggest pack size (in bytes) + 1
> echo $(( $(du -b .git/objects/pack/*pack | sort -n | tail -2 | head -1 | awk '{print $1}') + 1 ))
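
The same arithmetic can be checked without a repository. This is only a
sketch: second_biggest_plus_one is a hypothetical helper name, and the
sizes fed to it are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): read one pack size in bytes per
# line on stdin and print the second-largest size plus one.
second_biggest_plus_one() {
	second=$(sort -n | tail -n 2 | head -n 1)
	echo $((second + 1))
}

# Illustrative sizes only, not measurements from a real repository.
printf '%s\n' 900 100 50 30 | second_biggest_plus_one
```

For a real repository, the input would be the first column of
"du -b .git/objects/pack/*pack".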

I think you used a similar approach in t5319, where the batch size is
the third-smallest pack size plus one:

> test_expect_success 'repack creates a new pack' '
> (
> cd dup &&
> ls .git/objects/pack/*idx >idx-list &&
> test_line_count = 5 idx-list &&
> THIRD_SMALLEST_SIZE=$(test-tool path-utils file-size .git/objects/pack/*pack | sort -n | head -n 3 | tail -n 1) &&
> BATCH_SIZE=$(($THIRD_SMALLEST_SIZE + 1)) &&
> git multi-pack-index repack --batch-size=$BATCH_SIZE &&
> ls .git/objects/pack/*idx >idx-list &&
> test_line_count = 6 idx-list &&
> test-tool read-midx .git/objects | grep idx >midx-list &&
> test_line_count = 6 midx-list
> )
> '

Looking forward to a re-roll of this RFC.

Cheers,
Son Luong.


