On 2019-11-07 03:03:07 -0500, Jeff King wrote:
> On Thu, Nov 07, 2019 at 02:48:58PM +0700, Danh Doan wrote:
>
> > >   # and if we resolve and commit, presumably we'd get a broken commit,
> > >   # with iso8859-1 and no encoding header
> > >   echo resolved >file
> > >   git add file
> > >   GIT_EDITOR=: git rebase --continue
> > > -- 8< --
> > >
> > > But somehow it all seems to work. The resulting commit has real utf8 in
> > > it. I'm not sure if we pull it from the original commit via "commit -c",
> >
> > Yes, somehow it worked. But, without this patch, git also warns:
> >
> > % GIT_EDITOR=: git rebase --continue
> > Warning: commit message did not conform to UTF-8.
> > You may want to amend it after fixing the message, or set the config
> > variable i18n.commitencoding to the encoding your project uses.
> >
> > Checking with strace (on glibc, musl strace can't trace execve):
> >
> > > [pid 12848] execve("/home/danh/workspace/git/git", ["/home/danh/workspace/git/git", "commit", "-n", "-F", ".git/rebase-merge/message", "-e", "--allow-empty"], 0x558fb02e8240 /* 51 vars */) = 0
> >
> > Turn out, it's because of: commit.c::verify_utf8
> >
> > /*
> >  * This verifies that the buffer is in proper utf8 format.
> >  *
> >  * If it isn't, it assumes any non-utf8 characters are Latin1,
> >  * and does the conversion.
> >  */
> > static int verify_utf8(struct strbuf *buf)
> >
> > Hence, your test is just pure luck (because it's in latin1).
>
> Ah, thanks for resolving that mystery. Is it worth turning the scenario
> above into a test?

Yes, it's worth having a test.
In fact, I found another breakage (rebase with encoding) while writing
this test.

I'll delay the re-roll a bit to include that breakage.

-- 
Danh
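
For anyone following along, here is a rough sketch of the kind of Latin1 fallback that the quoted verify_utf8 comment describes. This is not git's code. The helper names utf8_seq_len and latin1_fallback_to_utf8 are made up for illustration, and the validity check is deliberately simplified (it does not reject overlong 3- and 4-byte forms or surrogates). The idea it shows: bytes that already form a well-formed UTF-8 sequence are copied as-is, and any other byte is assumed to be Latin1 and re-encoded as a two-byte UTF-8 sequence.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the length of the well-formed UTF-8
 * sequence starting at s, or 0 if the bytes do not form one.
 * Simplified check: lead byte range plus continuation bytes only.
 */
static size_t utf8_seq_len(const unsigned char *s, size_t remaining)
{
	size_t need, i;

	if (s[0] < 0x80)
		return 1;		/* plain ASCII */
	else if (s[0] >= 0xc2 && s[0] <= 0xdf)
		need = 2;
	else if (s[0] >= 0xe0 && s[0] <= 0xef)
		need = 3;
	else if (s[0] >= 0xf0 && s[0] <= 0xf4)
		need = 4;
	else
		return 0;		/* stray continuation or invalid lead byte */

	if (need > remaining)
		return 0;		/* truncated sequence */
	for (i = 1; i < need; i++)
		if ((s[i] & 0xc0) != 0x80)
			return 0;	/* not a continuation byte */
	return need;
}

/*
 * Hypothetical helper: copy valid UTF-8 through unchanged and treat
 * every other byte as Latin1, re-encoding it as two UTF-8 bytes.
 * Returns a malloc'd, NUL-terminated buffer (caller frees), or NULL.
 */
static char *latin1_fallback_to_utf8(const char *in, size_t len, size_t *out_len)
{
	const unsigned char *src = (const unsigned char *)in;
	/* worst case: every input byte expands to two output bytes */
	unsigned char *out = malloc(2 * len + 1);
	size_t i = 0, o = 0;

	if (!out)
		return NULL;
	while (i < len) {
		size_t n = utf8_seq_len(src + i, len - i);

		if (n) {
			memcpy(out + o, src + i, n);
			o += n;
			i += n;
		} else {
			/* only bytes >= 0x80 reach here: U+0080..U+00FF */
			unsigned char c = src[i++];

			out[o++] = 0xc0 | (c >> 6);
			out[o++] = 0x80 | (c & 0x3f);
		}
	}
	out[o] = '\0';
	if (out_len)
		*out_len = o;
	return (char *)out;
}
```

With this sketch, the four Latin1 bytes "caf\xe9" become the five bytes "caf\xc3\xa9", which is why a Latin1 message can end up as real UTF-8 even when no encoding header is recorded. An encoding that is not Latin1 would be mangled by the same fallback.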