Re: Re: [PATCH v1] commit-graph.c: no lazy fetch in lookup_commit_in_graph()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 22, 2022 at 2:23 AM Jonathan Tan <jonathantanmy@xxxxxxxxxx> wrote:
>
> Han Xin <hanxin.hx@xxxxxxxxxxxxx> writes:
> > If a commit is in the commit graph, we would expect the commit to also
> > be present. So we should use has_object() instead of
> > repo_has_object_file(), which will help us avoid getting into an endless
> > loop of lazy fetch.
> >
> > We can see the endless loop issue via this[1].
> >
> > 1. https://lore.kernel.org/git/20220612161707.21807-1-chiyutianyi@xxxxxxxxx/
>
> As described in SubmittingPatches:
>
>   Try to make sure your explanation can be understood
>   without external resources. Instead of giving a URL to a mailing list
>   archive, summarize the relevant points of the discussion.
>

Nod.

> > +test_expect_success 'setup' '
> > +     git init --bare dest.git &&
> > +     test_commit one &&
> > +     git checkout -b tmp &&
> > +     test_commit two &&
> > +     git push dest.git --all
> > +'
>
> You can commit directly to the repo by using "test_commit -C dest.git".
> Also, can the repositories be better named? I see a "dest.git" (which
> seems to contain all the objects), "alternates" (which seems to contain
> everything except refs/heads/tmp), "source" (which only contains the
> commit graph), and the current directory. It would probably be better to
> name them e.g. "with-commit", "without-commit", "only-commit-graph", and
> omit using the current directory altogether.

Yes, it makes sense to me.

>
> > +test_expect_success 'prepare a alternates repository without commit two' '
> > +     git clone --bare dest.git alternates &&
> > +     oid=$(git -C alternates rev-parse refs/heads/tmp) &&
> > +     git -C alternates update-ref -d refs/heads/tmp &&
> > +     git -C alternates gc --prune=now &&
> > +     pack=$(echo alternates/objects/pack/*.pack) &&
> > +     git verify-pack -v "$pack" >have &&
> > +     ! grep "$oid" have
> > +'
>
> OK, except refs/heads/tmp could probably have a better name.

I'll rethink the naming here.

>
> > +test_expect_success 'prepare a repository with a commit-graph contains commit two' '
> > +     git init source &&
> > +     echo "$(pwd)/dest.git/objects" >source/.git/objects/info/alternates &&
> > +     git -C source remote add origin "$(pwd)/dest.git" &&
> > +     git -C source config remote.origin.promisor true &&
> > +     git -C source config remote.origin.partialclonefilter blob:none &&
> > +     # the source repository has the whole refs contains refs/heads/tmp
> > +     git -C source fetch origin &&
> > +     (
> > +             cd source &&
> > +             test_commit three &&
> > +             git -c gc.writeCommitGraph=true gc
> > +     )
> > +'
>
> Is the purpose of the fetch only to add a ref? If yes, it's clearer just
> to create that branch instead of fetching.

Nod.

>
> > +test_expect_success 'change the alternates of source to that without commit two' '
> > +     # now we have a commit-graph in the source repository but without the commit two
> > +     echo "$(pwd)/alternates/objects" >source/.git/objects/info/alternates
> > +'
>
> OK.
>
> > +test_expect_success 'fetch the missing commit' '
> > +     git -C source fetch origin $oid 2>fetch.out &&
> > +     grep "$oid" fetch.out
> > +'
>
> Is the bug triggered by fetching the missing commit or by fetching any
> commit (which triggers the usage of the commit graph)? If any commit,
> then it's clearer to create an arbitrary commit and then fetch it.
>

Yes, using the missing object in the commit-graph seems to be a little
misleading.

> Also, I thought that the issue was an infinite loop, and the thing being
> tested here looks different from that. If you want to ensure that
> nothing is being fetched, you can use GIT_TRACE="$(pwd)/trace" to
> observe a fetch-pack command being invoked or
> GIT_TRACE_PACKET="$(pwd)/trace" to observe the packet being sent. If
> you're worried about an infinite loop, you can set origin to a directory
> that does not exist (so that the fetch immediately fails).

Maybe we can use "ulimit" ?

Then the test case can be:

    test_expect_success 'fetch the missing commit once' '
        ulimit -u 512 &&
        GIT_TRACE="$(pwd)/trace" git -C source fetch origin $oid 2>err &&
        ! grep "error: cannot fork" err &&
        test $(grep "fetch origin" trace | wc -l) -eq 1
     '

Without this fix, "git fetch" would finally succeed because we didn't
check the return value of promise_remote_get_direct(). We can find
the err output like this:

    error: cannot fork() for -c: Resource temporarily unavailable
    fatal: promisor-remote: unable to fork off fetch subprocess

And we can see a lot of trace logs as follows:

    trace: run_command: git -c fetch.negotiationAlgorithm=noop fetch
origin --no-tags --no-write-fetch-head --recurse-submodules=no
--filter=blob:none --stdin
    trace: built-in: git fetch origin --no-tags --no-write-fetch-head
--recurse-submodules=no --filter=blob:none --stdin

Thanks.
-Han Xin



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux