> On 13 Dec 2017, at 19:11, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Lars Schneider <larsxschneider@xxxxxxxxx> writes:
>
>> ... In a perfect world I think I would store
>> the encoding of a file in the tree object. I didn't pursue that solution
>> as this would change the Git data model which would open a can of worms
>> for a problem that not that many people have (almost everyone is on
>> UTF-8 anyways).
>
> Having that "encoding" trait recorded in the tree that contains the
> blob would mean that the same blob can be recorded with one
> "encoding" trait in a tree, and in a different tree it can be
> recorded with a different "encoding" trait. I doubt it really makes
> sense.

The "blob object" would store the text data encoded in a canonical UTF-8
form. The "tree object" would store the encoding. On checkin Git would
convert the text from the encoding recorded in the tree to UTF-8, and on
checkout it would do the reverse. That way you could control the encoding
of a text file for each path individually, similar to the "mode bits".
That also means you could change the encoding of a file while the blob
content stays the same.

Changing the encoding of a file with the .gitattributes approach can be
difficult if you have configured the attribute with a very broad pattern
(e.g. *.foo). You would either need to rename the file or limit the scope
of your pattern.

- Lars
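
To make the idea above a bit more concrete, here is a rough Python sketch of the model. It is only an illustration: `TreeEntry`, `checkin`, and `checkout` are hypothetical names, not Git's real data structures or functions, and the blob is held inline rather than referenced by object ID.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TreeEntry:
    """Hypothetical tree entry: path, mode bits, and an encoding trait."""
    path: str
    mode: int
    blob: bytes     # canonical UTF-8 content (stands in for the blob object)
    encoding: str   # assumed per-path encoding trait stored next to the mode


def checkin(path: str, worktree_bytes: bytes, encoding: str,
            mode: int = 0o100644) -> TreeEntry:
    """Convert working-tree bytes from `encoding` to canonical UTF-8."""
    text = worktree_bytes.decode(encoding)
    return TreeEntry(path, mode, text.encode("utf-8"), encoding)


def checkout(entry: TreeEntry) -> bytes:
    """Convert the canonical UTF-8 blob back to the entry's encoding."""
    return entry.blob.decode("utf-8").encode(entry.encoding)


# Round trip: the working-tree bytes survive checkin followed by checkout.
original = "héllo".encode("utf-16-le")
entry = checkin("doc.foo", original, "utf-16-le")
roundtrip = checkout(entry)

# Changing only the encoding trait leaves the blob content untouched.
reencoded = replace(entry, encoding="latin-1")
```

In this sketch, changing the encoding of one path is a change to that tree entry alone, so no pattern in .gitattributes has to be narrowed and no file has to be renamed.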