On Fri, May 21, 2021 at 06:45:02AM +0900, Junio C Hamano wrote: > > 2. This tightening is actually important if we want to avoid letting > > people _intentionally_ introduce the unreachable-but-incomplete > > scenario. Without it, an easy denial-of-service corruption against > > a repository you can push to is: > > > > - push an update to change a ref from X to Y. Include all objects > > necessary for X..Y, but _also_ include a tree T which points to > > a missing blob B. This will be accepted by the current rules > > (but not by your patch). > > > > - push an update to change the ref from Y to C, where C is a > > commit whose root tree is T. Your patch allows this (because we > > already have T in the repository). But the resulting repository > > is corrupt (the ref now points to an incomplete object graph). > > Hmph, the last step of that attack would not work with our current > check; is this the same new hole the series brings in as you > explained earlier for a case where a newly pushed tree/commit starts > to reference a left-over dangling tree already in the repository > whose content blobs are missing? Right. The current state is immune to this attack; the rule is "refs must be complete, but nothing else has to be". The state after Patrick's series is (I think) likewise immune; the rule is "all objects must be complete". I'm not sure if that was intentional, or a lucky save because the series tries to optimize both parts (both which objects we decide to check, as well as which we consider we "already have"). But it's quite subtle. :) > > If we wanted to keep the existing rule (requiring that any objects that > > sender didn't provide are actually reachable from the current refs), > > then we'd want to be able to check reachability quickly. And there I'd > > probably turn to reachability bitmaps. > > True. As we are not "performance is king---a code that corrupts > repositories as quickly as possible is an improvement" kind of > project, we should keep the existing "an object can become part of > DAG referred by ref tips only when the objects it refers to all > exist in the object store, because we want to keep the invariant: an > object that is reachable from a ref is guaranteed to have everything > reachable from it in the object store" rule, and find a way to make > it fast to enforce that rule somehow. That's my feeling, too. A rule of "you are not allowed to introduce objects which directly reference a missing object" would be likewise correct, if consistently enforced. But it makes me nervous to switch from one to the other, especially in just one place. And I guess in that sense, this series isn't immune as I said earlier. It looks like a push from a shallow clone would use the old "reachable from refs" check even after this series. -Peff