Re: [PATCH v2 3/3] sequencer: reencode to utf-8 before arrange rebase's todo list

Jeff King <peff@xxxxxxxx> · Thu, 7 Nov 2019 00:56:44 -0500

On Wed, Nov 06, 2019 at 05:03:22PM +0700, Danh Doan wrote:

> On 2019-11-05 23:03:14 -0500, Jeff King wrote:
> > >     3. You are dealing with a project originated on and migrated
> > >        from a foreign SCM, and older parts of the history are stored
> > >        in a non-utf-8 encoding, even though recent history is in utf-8
> > > 
> > > to the mix?
> > 
> > I would think you'd want to convert to utf-8 as you do the migration in
> > that case, since you're writing new hashes anyway.
> 
> Sorry but I'm confused.
> If we're migrating from a foreign SCM, we could make our commits in
> utf-8 (convert their commit messages to utf-8).
> Even if we need to synchronise history between the foreign SCM in
> question and git, we could use i18n.logoutputencoding to handle the
> output cosmetics.

Right, that's the same thing I'm suggesting.

> > But I think a similar
> > case would just be an old Git repository, where for some reason you
> > thought i18n.commitEncoding was a good idea back then (perhaps because
> > you were in situation (1) then, but now you aren't).
> > 
> > In either case, though, I don't think it's a compelling motivation for
> > optimization, if only because those old commits will be shown less and
> > less (and even without modern optimizations like commit-graph, we'd
> > generally avoid reencoding those old commits unless we're actually going
> > to _show_ them).
> 
> I'm not sure if we've misunderstood each other.
> I've only suggested encoding _new_ commits in utf-8 from now on.
> Reencoding old history in utf-8 is definitely not in that suggestion.

Yes. My point was that's _already_ the default behavior, unless you
explicitly set some config asking for non-utf8 commit objects. And I
don't think there's any good reason to set that these days.
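
To make that default concrete, here is a minimal Python sketch of how a
reader of a raw commit object could pick the message encoding: use the
"encoding" header if the commit has one, and assume utf-8 otherwise.
This is only an illustration and not Git's actual code; the function
names split_commit() and message_as_utf8() are made up for the example,
and it skips details such as multi-line headers.

```python
def split_commit(raw: bytes):
    """Split a raw commit object body into (headers, message_bytes)."""
    head, _, message = raw.partition(b"\n\n")
    headers = {}
    for line in head.split(b"\n"):
        if line.startswith(b" "):
            # Continuation of a multi-line header (e.g. gpgsig); ignored here.
            continue
        key, _, value = line.partition(b" ")
        # Keep the first occurrence of each header; values stay as bytes.
        headers.setdefault(key.decode("ascii"), value)
    return headers, message


def message_as_utf8(raw: bytes) -> str:
    """Decode the commit message, assuming utf-8 when no header says otherwise."""
    headers, message = split_commit(raw)
    encoding = headers.get("encoding", b"UTF-8").decode("ascii")
    return message.decode(encoding)
```

A commit written with default settings has no "encoding" header, so the
utf-8 branch is taken; only commits created with i18n.commitEncoding set
to something else carry the header and need the conversion.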

-Peff