Han-Wen Nienhuys <hanwen@xxxxxxxxxx> writes: > Most places that write a reflog message use a message that ends with '\n'. > > Some places (the one mentioned here) do not append a '\n'. Just to make sure I am following, I see if (update_ref("initial pull", "HEAD", merge_head, ... in bultin/pull.c; this is supposed to be written with "\n" under the new world order? As reflog messages are supposed to be a single liner (due to the nature of the original storage format), I have a feeling that it is cumbersome to the programmers to requre "\n" at the end. It would not add much value to the readability of the resulting code, either. If the places that have hardcoded messages like the above have to be written like so: if (update_ref("initial pull\n", "HEAD", merge_head, ... it makes the code worse. Let's see if I can come up with a coherent "rule" that we can follow to improve the current situation that is an undisciplined mixture as you found out. ... goes and looks the current code ... So, what happens is that update->msg in each ref transaction (in refs.c::ref_transaction_add_update()) gets verbatim copy of what the caller gives the refs API, but when it is written out as an reflog entry, the message is cleansed by calling refs.c::copy_reflog_msg() that squashes a run of whitespaces (not just SP but including LF and HT) into a single SP and then trims whitespaces at the end. Then to the result, a LF is unconditionally added. I would consider the massaging of the application-supplied message done in copy_reflog_msg() as normalization, which turns possibly invalid data (like a message that has LF anywhere in it) into an acceptable form. Which defines the kosher form of a reflog message as not having any LF (or excess SP) in it, and the terminating LF is not part of the message. From that point of view, if we were to do something, I would rather standardize on not having the final (and only) LF. But some callers that prepare a message in a string buffer may find it easier not having to strip the trailing LF, or even depend on the fact that they can feed a string split into multiple lines and the result becomes a single line. So I do not see a point in "fixing" the existing callers that gives an extra LF at the end or other not-so-kosher form of data---these will be normalized by calling copy_reflog_msg() anyway. If they are easy to fix, like sending a constant string with LF at the end, we would want to fix them to improve the readability of the resulting code, though. If the update_ref() call in builtin/pull.c were sending "initial pull\n" as its first parameter, it would be worth fixing by removing the LF, as the result would be easier to read and it is not too much work. On the other hand, if the message that eventually becomes a reflog entry comes from outside and if the existing code relies on the sanitizing done by copy_reflog_msg(), I do not see a point in carefully removing the LF at the end on the caller's side, either. > I initially thought we could or should fix this across git-core, but I > think it will be a lot of work to find all occurrences. I think we are on the same page, even if the definition of which is correct might be different between us. As long as all the refs backend does the same sanitizing (which is probably easy---just call copy_reflog_msg() from them, or perhaps we may want to move the call to where update->msg are queued for each step in a transaction, which would relieve individual backends from having to worry about this issue at all), they would all behave the same for the same reflog data. Thanks.