Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> writes:

> On Sun, Oct 23, 2011 at 4:51 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> ...
>> The low level object format of our commit is textual header fields, each
>> of which is terminated with a LF, followed by a LF to mark the end of
>> header fields, and then opaque payload that can contain any bytes. It does
>> not forbid a non-Git application to reuse the object store infrastructure
>> to store ASN.1 binary goo there, and the low level interface we give such
>> as cat-file is a perfectly valid way to inspect such a "commit" object.
>
> cat-file is fine, commit-tree (or any commands that call
> commit_tree()) cuts at NUL though.
> I wonder how git processes commit messages in utf-16.

That is exactly what I am saying.

Perhaps you either didn't read or didn't understand what you omitted from
your quoting; otherwise you wouldn't even have brought up utf-16. Let me
requote that part for you.

> But when it comes to "Git" Porcelains (e.g. the log family of commands),
> we do assume people do not store random binary byte sequences in commits,
> and we do take advantage of that assumption by splitting each "line" at
> LF, indenting them with 4 spaces, etc. In other words, a commit log in the
> Git context _is_ pretty much text and not arbitrary byte sequence.

Think about what cutting at every byte whose value is 012 (octal, i.e. LF)
and adding four bytes whose values are 040 (octal, i.e. SP) to each of the
"lines" formed by such cutting would do to UTF-16 goo, even if it does not
contain any NUL byte. As far as Git Porcelains are concerned, it is no
different from random binary byte sequences.
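
To make the last point concrete, here is a rough sketch in Python of the kind of byte-level processing described above. It is not Git's actual code: indent_like_log() is a hypothetical helper that only mimics "split at LF, indent with 4 spaces", and the three-character sample string is an assumption picked so that its UTF-16BE encoding contains the byte 012 but no NUL byte.

```python
def indent_like_log(raw: bytes) -> bytes:
    """Hypothetical stand-in for a Porcelain that treats the message as text:
    cut at every byte 0x0A (octal 012) and prefix each resulting "line"
    with four bytes 0x20 (octal 040)."""
    return b"\n".join(b"    " + line for line in raw.split(b"\n"))


# Sample text (an assumption for illustration): U+4E0A, U+010A, U+6587.
# None of these characters is a line feed, yet two of them contain the
# byte 0x0A when encoded as UTF-16BE.
original = "\u4e0a\u010a\u6587"
raw = original.encode("utf-16-be")      # bytes: 4e 0a 01 0a 65 87

pieces = raw.split(b"\n")               # the "lines" a byte-oriented tool sees
mangled = indent_like_log(raw)
decoded = mangled.decode("utf-16-be", errors="replace")

print("NUL byte present:", b"\x00" in raw)
print("number of 'lines':", len(pieces))
print("raw bytes:    ", raw.hex())
print("mangled bytes:", mangled.hex())
print("round-trips intact:", decoded == original)
```

The payload contains no line-feed character and no NUL byte, yet it is cut into three "lines", and the inserted space bytes end up decoded as extra UTF-16 code units, so the text that comes back is no longer the text that went in.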