Re: [PATCH] git gui: fix branch name encoding error on git gui

On 09/12/19 11:15AM, Junio C Hamano wrote:
> 加藤一博 <kato-k@xxxxxxxxxxxxx> writes:
> 
> > After "git checkout -b '漢字'" to create a branch with UTF-8
> > character in it, "git gui" shows the branch name incorrectly,
> > as it forgets to turn the bytes
> > read from the "git for-each-ref" and
> > read from "HEAD" file
> > into Unicode characters.
> 
> Thanks.
> 
> Note to the git-gui maintainer.  The above may want to be
> line-wrapped a bit.

Thanks. Already done :)
 
> > Signed-off-by: Kazuhiro Kato <kato-k@xxxxxxxxxxxxx>
> > ---
> >  git-gui.sh     | 1 +
> >  lib/branch.tcl | 2 ++
> >  2 files changed, 3 insertions(+)
> >
> > diff --git a/git-gui.sh b/git-gui.sh
> > index 0d21f56..8f4a9ae 100755
> > --- a/git-gui.sh
> > +++ b/git-gui.sh
> > @@ -684,6 +684,7 @@ proc load_current_branch {} {
> >  	global current_branch is_detached
> >  
> >  	set fd [open [gitdir HEAD] r]
> > +	fconfigure $fd -translation binary -encoding utf-8
> >  	if {[gets $fd ref] < 1} {
> >  		set ref {}
> >  	}
> 
> A comment totally outside the scope of this fix to anybody
> interested in further working on this code.
> 
> This piece of code is way too intimate with the implementation
> details of HEAD and yet not intimate enough to know that HEAD can be
> a symlink (in other words, it is a poor imitation of the real logic
> implemented in git core).  A kosher way to implement this would be
> to call
> 
> 	git symbolic-ref --quiet --short HEAD
> 
> which would succeed and give the branch name to its standard output,
> or would fail when the head is detached.  Set "current_branch" and
> "is_detached" according to the outcome.

It was introduced in fc4e8da (git-gui: Internalize symbolic-ref HEAD 
reading logic, 2007-05-30). The commit message is:

  To improve performance on fork+exec impoverished systems (such as
  Windows) we want to avoid running git-symbolic-ref on every rescan
  if we can do so.  A quick way to implement such an avoidance is to
  just read the HEAD ref ourselves; we'll either see it as a symref
  (starts with "ref: ") or we'll see it as a detached head (40 hex
  digits).  In either case we can treat that as our current branch.
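
For anyone following along, the check that commit message describes is
roughly the following. This is only a sketch, written in Python for
illustration rather than git-gui's Tcl. The name classify_head is made
up, and it assumes HEAD is a regular file holding UTF-8 text, so it
shares the symlink blind spot Junio points out above.

```python
import re


def classify_head(raw: bytes, encoding: str = "utf-8"):
    """Return (current_branch, is_detached) from the raw bytes of HEAD.

    Hypothetical helper: assumes HEAD is a plain file, not a symlink.
    """
    # Only the first line matters; decode it before comparing strings.
    line = raw.split(b"\n", 1)[0].decode(encoding, errors="replace").strip()

    if line.startswith("ref: "):
        ref = line[len("ref: "):]
        prefix = "refs/heads/"
        # Strip refs/heads/ when present so only the branch name is left.
        branch = ref[len(prefix):] if ref.startswith(prefix) else ref
        return branch, False

    if re.fullmatch(r"[0-9a-f]{40}", line):
        # 40 hex digits: a detached HEAD pointing straight at a commit.
        return "HEAD", True

    raise ValueError("unrecognized HEAD contents: %r" % line)
```

Decoding before the comparison is the part Kato-san's patch addresses:
without it, a name like '漢字' comes out as mangled bytes.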

Now I'm not sure how relevant this still is over 12 years later, but 
AFAIK a fork+exec is still very costly on Windows.

So I wonder whether we should manually check if HEAD is a symbolic link 
or we should just use git-symbolic-ref and hope the performance doesn't 
drop too much.
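
To show what the second option would look like, here is a sketch that
asks git instead of reading the file. It is Python for illustration
only; current_branch_via_git and its "run" parameter are hypothetical
names, and the only git invocation used is the one Junio quoted. It
assumes git exits with status 1 under --quiet when HEAD is detached and
treats any other non-zero status as a real error.

```python
import subprocess


def current_branch_via_git(repo_dir=".", run=subprocess.run):
    """Return (current_branch, is_detached) by running git symbolic-ref.

    Hypothetical helper; `run` is injectable so the logic can be tested
    without a repository.
    """
    result = run(
        ["git", "symbolic-ref", "--quiet", "--short", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
    )
    if result.returncode == 0:
        # Output is raw bytes; decode it, assuming UTF-8 refnames.
        name = result.stdout.decode("utf-8", errors="replace").rstrip("\n")
        return name, False
    if result.returncode == 1:
        # Assumed meaning with --quiet: HEAD is not a symbolic ref.
        return "HEAD", True
    raise RuntimeError("git symbolic-ref failed: %r" % result.stderr)
```

Each call spawns one process, which is exactly the cost the 2007 commit
was trying to avoid on every rescan.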
 
> And yes, Kato-san's fconfigure fix in this patch will still be
> relevant even after such a fix to the implementation of this proc.

Jonathan (Cc) and I have been having a discussion [0] about whether 
hard-coding UTF-8 as the refname encoding is the right idea. The gist of 
it is that Git _technically_ allows refnames to be in other encodings as 
long as the strings are NUL-terminated. It does not restrict itself to 
valid UTF-8 only. More details in the linked thread, of course.

My position is that we should default to UTF-8 given its popularity (at 
least in the Git world), but I'm wondering whether we should also add a 
config variable to allow users to configure their encodings.
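
To make the idea concrete, here is a sketch of the decoding step with an
optional override. Again it is Python for illustration only. The name
decode_refname is hypothetical, and the "configured" parameter stands in
for whatever config variable we might add; no particular key name is
implied.

```python
from typing import Optional


def decode_refname(raw: bytes, configured: Optional[str] = None) -> str:
    """Decode refname bytes, defaulting to UTF-8.

    `configured` is a hypothetical user-supplied encoding name.
    """
    encoding = configured or "utf-8"
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Bad bytes or an unknown encoding name: fall back to UTF-8 and
        # mark undecodable bytes instead of failing outright.
        return raw.decode("utf-8", errors="replace")
```

With no override a UTF-8 name decodes as-is, and a Shift_JIS repository
could be handled by passing "shift_jis" as the configured value.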

If you don't mind, your thoughts on this would be appreciated :)

[0] https://github.com/prati0100/git-gui/pull/21

-- 
Regards,
Pratyush Yadav


