On Wed, Mar 22, 2023 at 05:28:49PM +0100, Sjur Moshagen wrote:
> Thank you for filling out a Git bug report!
> Please answer the following questions to help us understand your issue.
>
> What did you do before the bug happened? (Steps to reproduce your issue)
> git clone https://github.com/giellalt/lang-sma
>
> What did you expect to happen? (Expected behavior)
> Clone to be clean, as reported by git status
>
> What happened instead? (Actual behavior)
> git status reported four changed files
>
> What's different between what you expected and what actually happened?
> Nothing except those four files
>
> Anything else you want to add:
> This only happens on an M2 Macbook Pro. With Apple's git (1.37.1), a huge number of files were reported as modified.
>
> Please review the rest of the bug report below.
> You can delete any lines you don't wish to share.
>
>
> [System Info]
> git version:
> git version 2.40.0
> cpu: arm64
> no commit associated with this build
> sizeof-long: 8
> sizeof-size_t: 8
> shell-path: /bin/sh
> feature: fsmonitor--daemon
> uname: Darwin 22.3.0 Darwin Kernel Version 22.3.0: Mon Jan 30 20:39:46 PST 2023; root:xnu-8792.81.3~2/RELEASE_ARM64_T6020 arm64
> compiler info: clang: 14.0.0 (clang-1400.0.29.202)
> libc info: no libc information available
> $SHELL (typically, interactive shell): /bin/zsh
>
>
> [Enabled Hooks]
>

As a general note, the current .gitattributes file can still be improved.
The general recommendation is to start the text attribute definitions
with a "catch all" rule, followed by the files that need special treatment.

The very first text/binary line would be

* text=auto

which tells Git to auto-detect the type of every file that is not
mentioned later (README.md, LICENCE and so on), see below.
This basically prevents corruption of binary files.

Then comes a list of file extensions that are known to be binary;
you can add them as shown below.
===============================
# List of files to be included in GitHub statistics.
# DO NOT EDIT - the file is updated via the template system.

# Some defaults:
* text=auto
*.aif binary
*.aiff binary
*.docx binary
*.fomabin binary
*.gz binary
[snip]
*.sh text eol=lf
===================================

Now, let's look at the main branch of the repo:

commit 4e0d949dbbd13a4dd9285ae3dee63abc822b805a (HEAD -> main, origin/main, origin/HEAD)
Author: Sjur N Moshagen <sjurnm@xxxxxxx>

Please run

git ls-files --eol

and you will see a line like this:

i/-text w/-text attr/text eol=lf corp/SNÅSNINGEN 2014/43 uke/Næringsliv.doc

The "i/-text" tells us that Git identified the file, as committed into
the repo, as binary ("non-text").
The "w/-text" tells us that Git identified the file, as seen on disk
in your working tree, as binary ("non-text").
So everything looks fine so far.

But then comes "attr/text eol=lf", which tells Git to ignore its own
detection and possibly corrupt the file at the next commit by
converting CRLF into LF.

As an example, "corp/SNÅSNINGEN 2014/43 uke/Næringsliv.doc" will be
corrupted, and others as well. It may or may not be reported as changed
after a fresh clone. But in any case, it will be corrupted when you
commit it. Oops.

And why is that? Because people forgot a line like

*.doc binary

for the doc files.

To put it in harsh words: I would call the .gitattributes file broken.
Please consider changing the default to auto, as pointed out by others as well.

My version of a .gitattributes would look like this:

=================
# Some defaults:
# Stay on the safe side for files that may be added later
* text=auto

# shell files need lf
*.sh text eol=lf

# Add other file types, like *.gz

# Include in statistics, no syntax colour ATM, classified as text instead:
[snip]
====================
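
To find every file that is in the same situation as the .doc file above (Git sees binary content in the index, but the attributes force text conversion), something along these lines could be run over the output of git ls-files --eol. This is only a rough sketch: the function names parse_eol_line and at_risk are made up for illustration, and it assumes the documented layout in which a TAB separates the three info columns from the path. Note that Git may quote non-ASCII paths in this output unless core.quotePath is set to false.

```python
import re

# Matches the three columns that come before the TAB in `git ls-files --eol` output.
_EOL_COLUMNS = re.compile(r"^i/(\S*)\s+w/(\S*)\s+attr/(.*?)\s*$")


def parse_eol_line(line):
    """Split one output line into (index_info, worktree_info, attr, path)."""
    columns, sep, path = line.rstrip("\n").partition("\t")
    match = _EOL_COLUMNS.match(columns)
    if not sep or match is None:
        raise ValueError(f"unexpected line format: {line!r}")
    index_info, worktree_info, attr = match.groups()
    return index_info, worktree_info, attr, path


def at_risk(lines):
    """Return paths stored as binary whose attributes still force text conversion."""
    risky = []
    for line in lines:
        if not line.strip():
            continue
        index_info, _worktree_info, attr, path = parse_eol_line(line)
        tokens = attr.split()
        # A bare "text" (as opposed to "text=auto") overrides Git's own detection.
        if index_info == "-text" and tokens and tokens[0] == "text":
            risky.append(path)
    return risky


# Hypothetical sample input, modelled on the line quoted above.
sample = [
    "i/-text w/-text attr/text eol=lf \tcorp/SNÅSNINGEN 2014/43 uke/Næringsliv.doc",
    "i/lf    w/lf    attr/text=auto \tREADME.md",
]
for risky_path in at_risk(sample):
    print(risky_path)
```

Every path this prints is a candidate for an explicit "binary" line in .gitattributes, or for falling back to the "* text=auto" default.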