On Wed, Nov 06, 2019 at 10:30:34AM +0900, Junio C Hamano wrote:

> > That's normally what we do. The only cases we're covering here are when
> > somebody has explicitly asked that the commit object be stored in
> > another encoding. Presumably they'd also be using a matching
> > i18n.logOutputEncoding in that case, in which case logmsg_reencode()
> > would be a noop. I think the only reasons to do that are:
> >
> >   1. You're stuck on some legacy encoding for your terminal. But in that
> >      case, I think you'd still be better off storing utf-8 and
> >      translating on the fly, since whatever encoding you do store is
> >      baked into your objects for all time (so accept some slowness now,
> >      but eventually move to utf-8).
> >
> >   2. Your preferred language is bigger in utf-8 than in some specific
> >      encoding, and you'd rather save some bytes. I'm not sure how big a
> >      deal this is, given that commit messages don't tend to be that big
> >      in the first place (compared to trees and blobs). And the zlib
> >      deflation on the result might help remove some of the redundancy,
> >      too.
>
> Perhaps add
>
>    3. You are dealing with a project originated on and migrated
>       from a foreign SCM, and older parts of the history is stored
>       in a non-utf-8, even though recent history is in utf-8
>
> to the mix?

I would think you'd want to convert to utf-8 as you do the migration in
that case, since you're writing new hashes anyway. But I think a similar
case would just be an old Git repository, where for some reason you
thought i18n.commitEncoding was a good idea back then (perhaps because
you were in situation (1) then, but now you aren't).

In either case, though, I don't think it's a compelling motivation for
optimization, if only because those old commits will be shown less and
less (and even without modern optimizations like commit-graph, we'd
generally avoid reencoding those old commits unless we're actually going
to _show_ them).
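
For anyone who wants rough numbers on the size argument in reason (2)
quoted above, here is a minimal Python sketch. It is not anything Git
itself runs: the sample message and the sizes() helper are made up for
illustration, and zlib.compress() only approximates the deflate step
applied to stored objects (it ignores the object header and the rest of
the commit).

```python
import zlib

# Hypothetical commit message, chosen only for illustration.
message = (
    "コミットメッセージの文字コードを確認する。\n"
    "\n"
    "古い履歴は別の文字コードで保存されている。\n"
)


def sizes(text, encoding):
    """Return (raw byte length, zlib-deflated byte length) for text."""
    raw = text.encode(encoding)
    return len(raw), len(zlib.compress(raw))


if __name__ == "__main__":
    # Compare utf-8 against two legacy Japanese encodings.
    for enc in ("utf-8", "euc_jp", "shift_jis"):
        raw_len, deflated_len = sizes(message, enc)
        print(f"{enc:10} raw={raw_len:4d} deflated={deflated_len:4d}")
```

The raw utf-8 form is larger because each kana or kanji takes three
bytes rather than two; how much of that gap survives deflation depends
on the message, so it is worth trying with real commit messages.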

> I suspect even the heavy Windows/Mac users in Japan have migrated
> out of legacy (the suspicion comes from an anecdote that is offtopic
> here).

Thanks for the data point. All of this is very far from my personal
experience, so I mostly go on scraps of hearsay I pick up reading this
or that. :)

-Peff