On Wed, Sep 11, 2019 at 08:04:14PM -0400, Jeff King wrote: > When the client has asked for certain shallow options like > "deepen-since", we do a custom rev-list walk that pretends to be > shallow. Before doing so, we have to disable the commit-graph, since it > is not compatible with the shallow view of the repository. That's > handled by 829a321569 (commit-graph: close_commit_graph before shallow > walk, 2018-08-20). That commit literally closes and frees our > repo->objects->commit_graph struct. A few notes and curiosities on my patch and this general area. The test suite passes with my patch both with and without GIT_TEST_COMMIT_GRAPH=1. But to my surprise, it also passes if I delete the close_commit_graph() line added by 829a321569! So it's not clear to me whether this whole thing is truly unnecessary (and Stolee was just being overly cautious because the code is related to shallow-ness, even though it is OK doing a true-parent traversal itself), or if we just don't have good test coverage for the case that requires it. If it _is_ necessary, I'm a little worried there are other problems lurking. The whole issue is that we've seen and parsed some commits before we get to this shallow deepen-since code-path. So just disabling commit-graphs isn't enough. Even without them, we might have parsed some commits the old-fashioned way and filled in their parent pointers. Is that a problem? If so, then I think we either have to "unparse" all of the existing in-memory commits, or call out to a rev-list process with a fresh memory space. One especially interesting bit is this: > +# - we must use protocol v2, because it handles the "have" negotiation before > +# processing the shallow direectives In the v0 protocol, we handle shallows at the end of receive_needs(). So it happens before we call into get_common_commits(), which may do further traversal. That makes it much more likely for the shallow code to see a "clean" slate. Should v2 be ordering things in the same way? But then would the have negotiation see the shallow parents? If so, is that good or bad? I'm pretty ignorant of how the shallow bits of upload-pack are supposed to work. > That creates an interesting problem for commits that have _already_ been > parsed using the commit graph. Their commit->object.parsed flag is set, > their commit->graph_pos is set, but their commit->maybe_tree may still > be NULL. When somebody later calls repo_get_commit_tree(), we see that > we haven't loaded the tree oid yet and try to get it from the commit > graph. But since it has been freed, we segfault! I was surprised we ever called repo_get_commit_tree() at all, since we're literally just traversing commits here. It looks like list-objects.c is very happy to queue pending trees for each commit, even if we're just going to throw them away when we get to process_tree()! I wonder if could be checking revs->tree_objects here and saving ourselves some work. > The new test in t5500 triggers this segfault, but see the comments there > for how horribly intimate it has to be with how both upload-pack and > commit graphs work. This bug was triggered by a real world case. I'm not entirely clear what the client-side state was; I had to reverse engineer the whole thing out of the server-side coredump. The test here is my attempt to cut it down to a minimum. I don't like having to manually generate the upload-pack input, but I had trouble finding a fetch command that would trigger it. And anyway, given how weirdly specific the requirements are for generating this case, it seemed sensible to me to keep the input close to the source of the problem. -Peff