Junio C Hamano <gitster@xxxxxxxxx> writes:

> "Robin H. Johnson" <robbat2@xxxxxxxxxx> writes:
>
>> from CVS to Git (we're very close now), we've decided that the signed
>> pushes will provide better security than our plan of previous plan of
>> using signed notes, so we'd like to see signed pushes succeed.
>
> Could you elaborate on your "previous plan" a bit?  What is a signed note,
> how would it help validate the authenticity, how do developers interact
> using it and what do you perceive as weaknesses compared to the signed
> push that we discussed a few weeks ago?

I was hoping to get more food for thought from people who have thought
about signed commits before sending this out, but here is my current
thinking (I retitled your "Signed push progress" as there is nothing to
"progress" on without an active discussion).

Originally, I very much wanted to like the approach of v3, which was
meant to be simpler by having the logic to record the signed push
certificate only on the sending end. At the mechanism level, v3 looked
simpler to me, but from the point of view of end users, I doubt that it
is simpler than the approach of v2, where the sender prepares a push
record, signs and sends it, and the receiver records it to its notes
tree to publish so that others can fetch and verify.

Here are some of the issues, from end users' point of view, that v3
would have that v2 would not, off the top of my head:

 - Unless you are pushing into a repository that nobody else pushes to,
   you have to first fetch the notes ref from where you are about to
   push to, then hope that your push does not conflict with others.
   If your push is rejected for non-fast-forward of the signed-push
   notes tree (but not for your real branches), you would have to
   rewind the push certificate the failed "git push" prepared (you
   could probably add a patch to v3 to do so automatically, but I
   haven't looked closely at all possible failure cases), run "git
   fetch", and then run "git push -s" again, which would create a new
   push record for you to sign.

   Because the "signed-push" namespace for notes is meant to cover all
   the branches, this will not work at all on a busy site that uses a
   CVS/SVN style "shared central repository" workflow.

 - If you are pushing into multiple places, you would somehow need to
   configure your end to keep one signed-push notes tree per remote
   that you intend to push to with a signature, to avoid contaminating
   remote repositories with records of your pushes into other remote
   repositories. It could be worked around with even more code on the
   sending end, but the need for configuration alone is already an
   additional mental burden.

 - It was also hoped that, with the v3 approach, a pre-receive or
   pre-update hook on the receiving end could be used to authenticate
   and authorize the push itself, but when the check happens, the
   signed-notes tree to be used for verification is not connected to
   any ref in the refs/notes/ hierarchy yet (otherwise it won't be a
   pre-* hook). The query interface "git notes show" needs to be
   updated so that it takes not just a ref via the GIT_NOTES_REF
   interface, which is defined to specify a ref because some
   subcommands of "git notes" need to create a new commit and update
   it, but also a bare notes tree commit object name [*1*]. We may need
   to update "git notes" (at least the "show" subcommand) for use on
   the receiving end; v3 is no longer a simpler "sender only" solution.
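
To make the first issue above concrete, here is a toy model (not Git
code) of a single shared notes ref that accepts only fast-forwards. The
class and function names are made up for illustration, and a "tip" is
reduced to a record count. It only shows why two people who update
different branches still collide on the one signed-push notes ref, and
why the loser has to fetch and sign again.

```python
from dataclasses import dataclass, field


@dataclass
class NotesRef:
    """Toy model of one shared notes ref: a linear list of push records."""
    records: list = field(default_factory=list)

    def tip(self) -> int:
        # Identify the tip "commit" by how many records precede it.
        return len(self.records)

    def push(self, based_on: int, record: str) -> bool:
        # Accept only fast-forwards: the pusher must have built on the tip.
        if based_on != self.tip():
            return False
        self.records.append(record)
        return True


def signed_push(ref: NotesRef, fetched_tip: int, record: str,
                max_tries: int = 3) -> int:
    """Return how many attempts were needed to land the push record."""
    for attempt in range(1, max_tries + 1):
        if ref.push(fetched_tip, record):
            return attempt
        # Rejected: discard the failed record, fetch, and sign a new one.
        fetched_tip = ref.tip()
    raise RuntimeError("gave up: the notes ref kept moving")


server = NotesRef()
alice_base = server.tip()  # Alice and Bob both fetch the same notes tip
bob_base = server.tip()
alice_tries = signed_push(server, alice_base, "alice: refs/heads/topic-a")
bob_tries = signed_push(server, bob_base, "bob: refs/heads/topic-b")
print(alice_tries, bob_tries)
```

In this model Bob needs a second attempt even though his branch has
nothing to do with Alice's, which is the contention a busy shared
repository would see on every push.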

I've shown how both the v2 and v3 models would look to end users with
working code, thought about the pros and cons probably longer than
anybody else, and at this point, if I were to choose between the two
approaches [*2*], I am inclined to suggest that we go with the v2 model
[*3*].

Either that, or we will see follow-up patches from people who still
think v3 is a better approach, to work around the above issues (and
there may be others we discover later).

Whether we go with v2 or v3, for people who want to verify the commits
against signed push certificates stored in the notes tree:

 - We need a wrapper like "tag --verify".

 - We also need a way to merge these signed push certificates [*4*]. I
   think the default notes merge is to concatenate, which would result
   in duplicates of the records that were present in the common
   ancestor (and no, "union" merge is not a safe way to remove these
   duplicates).

But see footnote *2* below.


[Footnotes]

*1* I wouldn't be surprised if it already worked when you give the
object name of the notes-tree commit to GIT_NOTES_REF when running "git
show", but that is not really a documented interface and would be
working by accident. The environment variable was designed to take the
name of a ref.

*2* I say "if I were to choose between the two" for a reason. It will
make things simpler if we drop "add signature separately to notes"
altogether, and instead adopt a "signed commit" approach. Embed the GPG
signature in a commit object, and allow the receiving end of the "push"
to be configured to reject a push that tries to place a non-signed
commit at the tip of a ref. The same "tip of a ref must be a signed
commit" check can be done for "fetch".

The most attractive part of the "signed commit" approach is that it does
not force Linus to fetch push-signature notes trees from his lieutenants
and merge them into his own push-signature notes tree, which is an
unnecessary chore.
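
As a rough illustration of the "tip of a ref must be a signed commit"
check, here is a minimal sketch that looks for a signature header in a
raw commit object. The header name "x-sig" and the rule that a leading
space continues the previous header are assumptions made up for this
example; nothing here has been decided. The sketch only tests for the
presence of the header and does not verify the signature.

```python
SIG_HEADER = "x-sig"  # hypothetical header name, chosen for this sketch


def parse_headers(raw_commit: str) -> dict:
    """Collect header fields of a raw commit, up to the first blank line."""
    headers = {}
    last_key = None
    header_part = raw_commit.split("\n\n", 1)[0]
    for line in header_part.split("\n"):
        if line.startswith(" ") and last_key is not None:
            # Assumed convention: a leading space continues the last header.
            headers[last_key][-1] += "\n" + line[1:]
            continue
        key, _, value = line.partition(" ")
        headers.setdefault(key, []).append(value)
        last_key = key
    return headers


def accept_update(raw_tip_commit: str) -> bool:
    """Receiver-side policy: the proposed tip must carry a signature header."""
    return SIG_HEADER in parse_headers(raw_tip_commit)


unsigned = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author A U Thor <author@example.com> 1316000000 +0000\n"
    "committer A U Thor <author@example.com> 1316000000 +0000\n"
    "\n"
    "Add feature\n"
)
# Insert the hypothetical signature header just before the log message.
signed = unsigned.replace(
    "\n\n",
    "\nx-sig -----BEGIN PGP SIGNATURE-----\n"
    " (signature body elided)\n"
    " -----END PGP SIGNATURE-----\n\n",
    1,
)
print(accept_update(unsigned), accept_update(signed))
```

Because the parser stops at the first blank line, text in the log
message that merely looks like a signature header is not mistaken for
one, and headers it does not know about are simply carried along.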

The most likely thing to happen, especially under the v3 design, would
be that higher level maintainers will not bother to fetch/verify/merge
the signed-push notes trees from their feeders, and the final publishing
site will only have the push certificates from the owner of the
repository at the top level, without downstream contributors'
signatures. The v2 design already relies on the final verifier to
independently collect signed-push notes from the publishing repositories
of Linus and all the key repositories Linus pulled from before
verification, which feels more cumbersome, but the same needs to be done
in v3 if the higher level maintainers do not buy into the
fetch/verify/merge overhead for the push-signature notes tree.

If signatures are embedded in the commits themselves, the issue of
merging push-signature notes trees disappears. Whenever the top level
maintainer pulls from his lieutenants, his fetch can (and should) check
the signatures of these lieutenants, and their signatures stay in the
history the top level maintainer integrates and eventually pushes out
with his own signature.

As to the embedding of the signature in the commit, I am inclined to put
the lines of the GPG detached signature in new header lines, after the
standard tree/parent/author/committer headers, instead of tucking them
at the end of the commit log message text, for multiple reasons:

 - The signature won't clutter output from "git log" and friends if it
   is in the extra header. If we place it at the end of the log
   message, we would need to teach "git log" and friends to strip the
   signature block with an option.

 - Teaching new versions of "git log" and "gitk" to optionally verify
   and show signatures is cleaner if we structurally know where the
   signature block is (instead of scanning for it in the commit log
   message).

 - The signature needs to be stripped upon various commit rewriting
   operations, e.g. rebase, filter-branch, etc.
   They all already ignore unknown headers, but if we place the
   signature in the log message, all of these tools (and third-party
   tools) also need to learn what a signature block looks like.

 - When we added the optional encoding header, all the tools (both
   in-tree and third-party) that act on the raw commit object should
   have been fixed to ignore headers they do not understand, so it is
   not as if a new header would be more likely to break things than
   extra text in the commit.

*3* Honestly speaking, I myself was disturbed by the v2 model where the
signed push is recorded primarily at the receiver. At the philosophical
level, the approach seems to go very much against the "distributed"
nature of Git. Sending what you want to be committed to the server and
having the server make a commit feels so very SVN/CVS. The usual "push"
workflow for Git users is to fetch first to come close to the other end,
integrate your work so that what you push out contains all of what the
other end has, and then push the result out, hoping that you are the
latest and nobody else had a chance to update the same thing.

But after thinking about it a bit more, I came to realize that the
record of a push is quite unlike the branches you and others work on.
First of all, you are recording what happens at the receiving end, "I
updated these refs to these values in this repository". Making the
record at the receiving end is more natural than writing "I plan to
update these refs", sending it over to a dumb receiver and hoping it
will fast-forward.

Don't get me wrong. The offline distributed workflow is a great enabler
in a distributed system like Git.
The work you do on your branches can be fully asynchronous to the
outside world, and being able to have local history that can later be
merged, so that we avoid losing work by others while still staying
asynchronous, is a great thing to have. But the act of pushing the end
result and recording the fact that you pushed it into a repository is
inherently a synchronous event: you and the receiving repository have to
be connected when your "push" happens. I do not see a need to be
dogmatic and insist that everything we do is asynchronous.

*4* For the purpose of pushing things out, even with the v3 design, a
pusher does not have to worry about merging the signed-push notes tree,
I think (there is no merge issue for pushers with the v2 model). "git
push -s" will add a push record to the notes tree, and if the result
does not fast-forward, "git fetch $remote +refs/notes/signed-push" (or,
with a per-remote signed-push hierarchy,
"+refs/notes/$remote/signed-push"), which discards the single failed
push record, would be all that is needed before attempting "git push -s"
again without losing any information.