getting list of objects for packing

I'm trying to write a script that will repack large binary or compressed
objects into their own non-compressed, non-delta'ed pack file.

To make the decision about whether an object should go into this special
pack file or not, I want the output from 'git cat-file --batch-check'.
I get it with something similar to:

   git rev-list --objects --all |
      sed -e 's/^\([0-9a-f]\{40\}\).*/\1/' |
      git cat-file --batch-check

First question: is the rev-list call correct?
  - If I understand correctly, rev-list produces the list of objects in
    the right order for piping to pack-objects.
  - The sed statement strips everything after the sha1. Is there any way
    to get rev-list to print just the sha1, so that sed is not necessary?
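As a possible simplification of the sed step: each line printed by
'rev-list --objects' is the object name, optionally followed by a space
and a path, so cutting at the first space should give the same result.
A minimal sketch, where strip_paths is a hypothetical helper name and
not a git command:

```shell
# strip_paths: hypothetical helper. Keeps only the first space-separated
# field (the object name) of each line read from stdin.
strip_paths() {
    cut -d' ' -f1
}

# Intended use, assuming the current directory is inside a git repository:
#   git rev-list --objects --all | strip_paths | git cat-file --batch-check
```

Lines that carry no path (commits, for example) pass through unchanged.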

Then I want to parse the output from cat-file and use an external program
to detect the file format. Here is a simplified version:

  | while read sha1 type size; do

       if [ "$type" = "blob" ]; then
           if ! ( git cat-file blob "$sha1" | file -b - | grep -q text ) &&
              [ "$size" -ge "$threshold" ]; then
               : # pack into special pack
           else
               : # pack normally into normal pack
           fi
       fi
  done

All of this has actually been rewritten into a perl script, so ignore any
syntax mistakes.
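For reference, the decision logic above can be written as a standalone
sketch that only labels each blob instead of packing it. The names
classify and is_text and the default threshold value are assumptions made
for illustration; they are not taken from the perl script.

```shell
# Assumed size cutoff in bytes (hypothetical default; override as needed).
threshold=${threshold:-1048576}

# is_text: succeeds if file(1) describes the blob's contents as text.
is_text() {
    git cat-file blob "$1" | file -b - | grep -q text
}

# classify: reads "sha1 type size" lines (the --batch-check output format)
# and prints "special <sha1>" or "normal <sha1>" for every blob.
# Non-blob objects are skipped, as in the loop above.
classify() {
    while read -r sha1 type size; do
        [ "$type" = "blob" ] || continue
        # Check the size first so file(1) only runs on large blobs.
        if [ "$size" -ge "$threshold" ] && ! is_text "$sha1"; then
            echo "special $sha1"
        else
            echo "normal $sha1"
        fi
    done
}

# Intended use, assuming the current directory is inside a git repository:
#   git rev-list --objects --all | cut -d' ' -f1 |
#       git cat-file --batch-check | classify
```

Testing the size before the content is only an ordering choice; the
result is the same as in the original condition.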

I have successfully created the two pack files that I was trying to make.
By "successfully" I mean that after removing the existing packs and objects
and putting the two generated pack files in place, 'git fsck --full' prints
no errors and exits successfully.

These two packs will be placed into a central repository.

ISSUE TWO:

I have placed these two packs into my own personal repo, and I have unpacked all
of the other objects so that they are loose.

I thought I could use a similar sequence of commands to pack those loose objects
into a normal and a special pack. I added the --unpacked option to my rev-list
command, but it still lists many more objects than exist as loose objects in the
repository.

   git rev-list --objects --unpacked --all
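To make the discrepancy concrete, the loose objects can be listed
directly from the object directory and compared with the rev-list output.
This is only a diagnostic sketch: loose_objects is a hypothetical helper,
and it assumes the usual layout in which a loose object is stored as
objects/<first two hex digits>/<remaining digits>.

```shell
# loose_objects: hypothetical helper. Prints the name of every loose object
# found under the given object directory (default: .git/objects) by joining
# each two-character fan-out directory name with the file names inside it.
loose_objects() {
    objdir=${1:-.git/objects}
    for f in "$objdir"/[0-9a-f][0-9a-f]/*; do
        [ -f "$f" ] || continue      # skips the unexpanded glob when empty
        d=${f%/*}
        echo "${d##*/}${f##*/}"
    done
}

# Intended comparison, assuming the current directory is the top of a
# non-bare repository:
#   loose_objects | sort > loose.txt
#   git rev-list --objects --unpacked --all | cut -d' ' -f1 | sort > listed.txt
#   comm -13 loose.txt listed.txt   # listed by rev-list but not loose
```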

The man page says:

   --objects
          Print the object IDs of any object referenced by the listed
          commits. --objects foo ^bar thus means "send me all object IDs
          which I need to download if I have the commit object bar, but
          not foo".

   --unpacked
          Only useful with --objects; print the object IDs that are not
          in packs.

Is this the correct behavior for rev-list --unpacked?
Am I mis-reading the --unpacked text, or should it be changed?

-brandon
