Re: [PATCH v2 7/8] repack: implement `--filter-to` for storing filtered out objects

On Wed, Jul 5, 2023 at 8:26 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Christian Couder <christian.couder@xxxxxxxxx> writes:
>
> > A previous commit has implemented `git repack --filter=<filter-spec>` to
> > allow users to filter out some objects from the main pack and move them
> > into a new different pack.
>
> OK, this sidesteps the question I had on an earlier step rather
> nicely.  Instead of having to find out which ones are to be moved
> away, just generating them in a separate location would be more
> straight forward.
>
> The implementation does not seem to restrict where --filter-to
> directory can be placed, but shouldn't it make sure that it is one
> of the already specified alternates directories?  Otherwise the user
> will end up corrupting the repository, no?

I don't think the implementation should restrict where the
--filter-to directory can be placed.

In version 3, which I just sent, I have written the following in the
commit message to explain this:

"
   Even in a different directory, this pack can be accessible if, for
   example, the Git alternates mechanism is used to point to it. In fact
   not using the Git alternates mechanism can corrupt a repo as the
   generated pack containing the filtered objects might not be accessible
   from the repo any more. So setting up the Git alternates mechanism
   should be done before using this feature if the user wants the repo to
   be fully usable while this feature is used.

   In some cases, like when a repo has just been cloned or when there is no
   other activity in the repo, it's Ok to setup the Git alternates
   mechanism afterwards though. It's also Ok to just inspect the generated
   packfile containing the filtered objects and then just move it into the
   '.git/objects/pack/' directory manually. That's why it's not necessary
   for this command to check that the Git alternates mechanism has been
   already setup.
"
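
To illustrate what "setting up the Git alternates mechanism" amounts to,
here is a minimal Python sketch that adds an object directory to
`objects/info/alternates`. It is only an illustration: the helper name
`add_alternate` is hypothetical and not part of Git or of this patch
series, and the sketch always records an absolute path and ignores
quoted entries.

```python
import os


def add_alternate(objects_dir, alt_objects_dir):
    """Append alt_objects_dir to <objects_dir>/info/alternates.

    Hypothetical helper for illustration only. Returns True if a new
    entry was written, False if the directory was already listed.
    """
    info_dir = os.path.join(objects_dir, "info")
    os.makedirs(info_dir, exist_ok=True)
    path = os.path.join(info_dir, "alternates")
    entry = os.path.abspath(alt_objects_dir)

    content = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            content = fh.read()

    # The alternates file lists one object directory per line.
    if entry in (line.strip() for line in content.splitlines()):
        return False

    with open(path, "a", encoding="utf-8") as fh:
        # Avoid gluing the new entry onto an unterminated last line.
        if content and not content.endswith("\n"):
            fh.write("\n")
        fh.write(entry + "\n")
    return True
```

With something like this done beforehand, a pack written into the `pack`
subdirectory of that object directory stays reachable from the repo.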

I haven't mentioned cases related to promisor remotes, but I think in
some of those cases the feature can be very useful too while there is
no need to check that the Git alternates mechanism has been set up.

In version 3, the docs for the --filter-to option and the corresponding
gc.repackFilterTo config variable look like this:

+--filter-to=<dir>::
+       Write the pack containing filtered out objects to the
+       directory `<dir>`. Only useful with `--filter`. This can be
+       used for putting the pack on a separate object directory that
+       is accessed through the Git alternates mechanism. **WARNING:**
+       If the packfile containing the filtered out objects is not
+       accessible, the repo could be considered corrupt by Git as it
+       might not be able to access the objects in that packfile. See
+       the `objects` and `objects/info/alternates` sections of
+       linkgit:gitrepository-layout[5].

+gc.repackFilterTo::
+       When repacking and using a filter, see `gc.repackFilter`, the
+       specified location will be used to create the packfile
+       containing the filtered out objects. **WARNING:** The
+       specified location should be accessible, using for example the
+       Git alternates mechanism, otherwise the repo could be
+       considered corrupt by Git as it might not be able to access the
+       objects in that packfile. See the `--filter-to=<dir>` option
+       of linkgit:git-repack[1] and the `objects/info/alternates`
+       section of linkgit:gitrepository-layout[5].

So they warn about possible issues with the feature and link to some
relevant doc.

Now if we think that this is not enough, I could implement a check in
the code that would warn users loudly if the directory specified by
those options is not accessible through the Git alternates mechanism.
I think it would be annoying and too restrictive to error out in that
case, though.
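
To make the idea of such a check concrete, here is a rough Python sketch
of the logic. It is not the actual implementation, and the function
names are hypothetical. It assumes the directory given to --filter-to is
meant to be the `pack` subdirectory of either the repo's own object
directory or an object directory listed in `objects/info/alternates`.
It does not follow alternates of alternates and does not handle quoted
entries.

```python
import os


def read_alternates(objects_dir):
    """Return the object directories listed in objects/info/alternates."""
    path = os.path.join(objects_dir, "info", "alternates")
    if not os.path.exists(path):
        return []
    result = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            entry = line.strip()
            # Skip blank lines and comment lines.
            if not entry or entry.startswith("#"):
                continue
            # Relative entries are resolved against the object directory.
            if not os.path.isabs(entry):
                entry = os.path.join(objects_dir, entry)
            result.append(os.path.realpath(entry))
    return result


def filter_to_is_reachable(objects_dir, filter_to_dir):
    """Return True if filter_to_dir is the pack dir of a known object dir."""
    target = os.path.realpath(filter_to_dir)
    candidates = [os.path.realpath(objects_dir)] + read_alternates(objects_dir)
    return any(
        os.path.realpath(os.path.join(candidate, "pack")) == target
        for candidate in candidates
    )
```

A caller would emit a warning, not an error, when
`filter_to_is_reachable()` returns False, so that the cases described
above (setting up alternates afterwards, or moving the pack manually)
keep working.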



