On 31.03.2017 at 14:38, Torsten Bögershausen wrote:
> On 30.03.17 21:35, Jakub Narębski wrote:
>> Hello,
>>
>> Recently I had to work on a project which uses legacy 8-bit encoding
>> (namely cp1250 encoding) instead of utf-8 for text files (LaTeX
>> documents). My terminal, that is Git Bash from Git for Windows is set
>> up for utf-8.
>>
>> I wanted for "git diff" and friends to return something sane on said
>> utf-8 terminal, instead of mojibake. There is 'encoding'
>> gitattribute... but it works only for GUI ('git gui', that is).
>>
>> Therefore I have (ab)used textconv facility to convert from cp1250 of
>> file encoding to utf-8 encoding of console.
>>
>> I have set the following in .gitattributes file:
>>
>>     ## LaTeX documents in cp1250 encoding
>>     *.tex text diff=mylatex
>>
>> The 'mylatex' driver is defined as:
>>
>>     [diff "mylatex"]
>>         xfuncname = "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$"
>>         wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+"
>>         textconv = \"C:/Program Files/Git/usr/bin/iconv.exe\" -f cp1250 -t utf-8
>>         cachetextconv = true
>>
>> And everything would be all right... if not the fact that Git appends
>> spurious ^M to added lines in the `git diff` output. Files use CRLF
>> end-of-line convention (the native MS Windows one).
>>
>>     $ git diff test.tex
>>     diff --git a/test.tex b/test.tex
>>     index 029646e..250ab16 100644
>>     --- a/test.tex
>>     +++ b/test.tex
>>     @@ -1,4 +1,4 @@
>>     -\documentclass{article}
>>     +\documentclass{mwart}^M
>>
>>      \usepackage[cp1250]{inputenc}
>>      \usepackage{polski}
>>
>> What gives? Why there is this ^M tacked on the end of added lines,
>> while it is not present in deleted lines, nor in content lines?
>>
>> Puzzled.
>>
>> P.S. Git has `i18n.commitEncoding` and `i18n.logOutputEncoding`; pity
>> that it doesn't supports in core `encoding` attribute together with
>> having `i18n.outputEncoding`.
>
> Is there a chance to give us a receipt how to reproduce it?
> A complete test script or ?
> (I don't want to speculate, if the invocation of iconv is the problem,
> where stdout is not in "binary mode", or however this is called under Windows)

I'm sorry, I thought I had posted the whole recipe, but I left some
details out of the above description of the case.

First, the files are stored on the filesystem with CRLF line endings
(the DOS end-of-line convention). Because of `core.autocrlf`, they are
converted to LF in blobs, that is, in the index and in the repository.

Second, a textconv filter that preserves end-of-line characters needs
to be configured. I used `iconv`, but I suspect that the problem would
also happen with `cat`.

In the .gitattributes file, or in .git/info/attributes, add for example:

    *.tex text diff=myconv

In .git/config, configure the textconv filter, for example:

    [diff "myconv"]
        textconv = iconv.exe -f cp1250 -t utf-8

Create a file whose name matches the attribute line and which uses the
CRLF end-of-line convention, and add it to Git (that is, to the index):

    $ printf "foo\r\n" >foo.tex
    $ git add foo.tex

Modify the file (again with CRLF):

    $ printf "bar\r\n" >foo.tex

Check the difference:

    $ git diff foo.tex

HTH
-- 
Jakub Narębski
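
The steps above can be collected into one script. This is only a minimal sketch under a few assumptions that are not in the recipe itself: it works in a throwaway directory made with `mktemp -d`, it sets `core.autocrlf=true` explicitly (the recipe only says the setting is in effect), it calls `iconv` rather than `iconv.exe` so that it can also be tried outside Git for Windows, and it pipes the diff through `cat -v` so that any carriage return shows up as ^M. Whether the stray ^M actually appears may depend on the platform and the Git version.

```shell
#!/bin/sh
# Reproduction sketch for the recipe above.
# Assumptions (not from the original message): the temporary directory,
# the explicit core.autocrlf=true, and plain "iconv" instead of "iconv.exe".
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# CRLF in the working tree, LF in blobs (index and repository).
git config core.autocrlf true

# The textconv filter; it passes end-of-line characters through unchanged.
git config diff.myconv.textconv "iconv -f cp1250 -t utf-8"

# Attach the diff driver to *.tex files.
echo '*.tex text diff=myconv' >.gitattributes

# Create a CRLF file and add it to the index.
printf 'foo\r\n' >foo.tex
git add foo.tex

# Modify the file, again with CRLF.
printf 'bar\r\n' >foo.tex

# Show the diff with control characters made visible.
git diff foo.tex | cat -v
```

If the output shows ^M on the added line only, the problem is reproduced; replacing the `iconv` command with `cat` would test the suspicion that any end-of-line-preserving filter triggers it.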