Jan-Philip Gehrcke <jgehrcke@xxxxxxxxxxxxxx> writes:

> I was surprised to see that the output of
>
>     git log --encoding=utf-8 "--format=format:%b"
>
> can contain byte sequences that are invalid in UTF-8. Note: I am using
> git 2.1.4 and the %b format specifier represents the commit message
> body.

Yeah, if the original was bad and cannot be sanely expressed in
UTF-8, you have two options.

You can show the contents as raw bytes recorded in the object with a
warning so that the user can use it as such (e.g. perhaps the
original was indeed an iso8859-2 but was incorrectly marked as
UTF-8, or something like that, and a human that is more intelligent
than a tool _could_ guess and attempt to recover).

Or you can error out and refuse to produce output.

We deliberately made a design choice to take the former option.
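
For a script that consumes this output, the practical consequence is that the bytes cannot be assumed to be valid UTF-8. Below is a minimal sketch of the "guess and attempt to recover" idea on the consumer side. The function name decode_commit_body and the choice of iso-8859-2 as the fallback are assumptions made purely for illustration; nothing in git prescribes them.

```python
from typing import Tuple


def decode_commit_body(raw: bytes, fallback: str = "iso-8859-2") -> Tuple[str, str]:
    """Decode a commit body, returning (text, encoding_used).

    Hypothetical helper: strict UTF-8 is tried first. If the bytes are
    not valid UTF-8, they are decoded with a guessed legacy encoding.
    The fallback is only a guess and may be wrong for a given repository.
    """
    try:
        # Strict decoding raises UnicodeDecodeError on any invalid sequence.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # The raw bytes were passed through as recorded in the object;
        # reinterpret them using the guessed encoding instead.
        return raw.decode(fallback, errors="replace"), fallback


if __name__ == "__main__":
    # b"z\xb3oty" is "zloty" with a stroked l in ISO-8859-2; the lone
    # 0xB3 byte makes it invalid as UTF-8.
    for sample in (b"plain ascii", b"z\xb3oty"):
        text, used = decode_commit_body(sample)
        print(f"{sample!r} decoded via {used}: {text}")
```

The bytes fed to such a helper would typically come from reading the stdout of the git log command quoted above in binary mode, so that no decoding happens before the check.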