Re: [PATCH v7 0/7] convert: add support for different encodings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 27, 2018 at 04:25:38PM -0500, Jeff King wrote:
> On Tue, Feb 27, 2018 at 10:05:17PM +0100, Torsten Bögershausen wrote:
> 
> > The other question is:
> > Would this help showing diffs of UTF-16 encoded files on a "git hoster",
> > github/bitbucket/.... ?
> 
> Almost. There's probably one more thing needed. We don't currently read
> in-tree .gitattributes when doing a diff in a bare repository. And most
> hosting sites will store bare repositories.
> 
> And of course it would require the users to actually set the attributes
> themselves.
> 
> > Or would the auto-magic UTF-16 avoid binary patch that I send out be more helpful ?
> > Or both ?
> > Or the w-t-e encoding ?
> 
> Of the three solutions, I think the relative merits are something like
> this:
> 
>   1. baked-in textconv (my patch)
> 
>      - reuses an existing diff feature, so minimal code and not likely to
>        break things
> 
>      - requires people to add a .gitattributes entry
> 
>      - needs work to make bare-repo .gitattributes work (though I think
>        this would be useful for other features, too)
> 
>      - has a run-time cost at each diff to do the conversion
> 
>      - may sometimes annoy people when it doesn't kick in (e.g.,
>        emailed patches from format-patch won't have a readable diff)
> 
>      - doesn't combine with other custom-diff config (e.g., utf-16
>        storing C code should still use diff=c funcname rules, but
>        wouldn't with my patch)
> 
>   2. auto-detect utf-16 (your patch)
>      - Just Works for existing repositories storing utf-16
> 
>      - carries some risk of kicking in when people would like it not to
>        (e.g., when they really do want a binary patch that can be
>        applied).

The binary patch is still supported, but that detail may need some more explanation
in the commit message. Please see  t4066-diff-encoding.sh
  test_expect_success 'diff --binary against local change' '
         cp file2 file &&
         test_tick &&
         cat >expect <<-\EOF &&
         diff --git a/file b/file
         index 26acf09b0aad19fb22566956d1a39cb4e2a3b420..e98d27acfb90cfcfc84fcc5173baa4aa7828290f 100644
         GIT binary patch
         literal 28
         ecmezW?;ArgLn;Fo!ykquAe{qbJq3!C0BHb{ln3Pi

         literal 32
         icmezW?+HT@Lpnn$kmO?c#!w7oaWVX1NCMJ1Ko$VA_z0~4

  EOF
         git diff --binary file >actual &&
         test_cmp expect actual

> 
>        I think it would probably be OK if this kicked in only when
>        ALLOW_TEXTCONV is set (the default for porcelain), and --binary
>        is not (i.e., when we would otherwise just say "binary
>        files differ").

The user can still use "git diff" (Where auto-detection of UTF-16 kicks in
and replaces "binary files differ" with an UTF-8 diff.
When the user wants a patch, "git diff --binary" will generate a binary patch,
as before.
The only thing which is missing is the line "binary files differ", which may be a
regression. I can re-add it in V2.

> 
>      - similar to (1), carries a run-time cost for each diff, and users
>        may sometimes still see binary diffs
> 
>   3. w-t-e (Lars's patch)
> 
>      - requires no server-side modifications; the diff is plain vanilla
> 
>      - works everywhere you diff, plumbing and porcelain
> 
>      - does require people to add a .gitattributes entry
> 
>      - run-time cost is per-checkout, not per-diff
> 
> So I can see room for (3) to co-exist alongside the others. Between (1)
> and (2), I think (2) is probably the better direction.
> 
> -Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux