On 23.09.19 at 12:04, Alexandr Miloslavskiy via GitGitGadget wrote:
> From: Alexandr Miloslavskiy <alexandr.miloslavskiy@xxxxxxxxxxx>
>
> After I discovered that UTF-16-LE-BOM test was bugged and still
> succeeded, I decided that better tests are required. Possibly the best
> option here is to compare git results against hardcoded ground truth.
>
> The new tests also cover more interesting chars where (ANSI != UTF-8).

What are we testing here? Is there some back-and-forth conversion going
on, and are we testing that the conversion happens at all, or that the
correct conversion/encoding is picked, or that the conversion that is
finally chosen is correct?

Why does it help to test more interesting chars (and would you not also
regard codepoints outside the BMP as the most interesting, because they
require surrogate codepoints in UTF-16)?

-- Hannes
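
For reference, the distinction raised above between chars where
ANSI != UTF-8 and codepoints outside the BMP can be sketched in a few
lines of Python. This is only an illustration, not part of the patch:
the two sample characters are arbitrary choices, and Windows-1252 is
assumed as a stand-in for "ANSI", which in practice depends on the
system code page.

```python
# Illustrative sketch only; sample characters and the cp1252 stand-in for
# "ANSI" are assumptions, not taken from the patch under review.

E_ACUTE = "\u00e9"       # inside the BMP, differs between cp1252 and UTF-8
GRINNING = "\U0001F600"  # outside the BMP, needs a surrogate pair in UTF-16


def hex_bytes(text, encoding):
    """Return the encoded bytes of text as space-separated hex."""
    return text.encode(encoding).hex(" ")


def surrogate_pair(ch):
    """Compute the UTF-16 surrogate pair for one non-BMP character."""
    offset = ord(ch) - 0x10000
    high = 0xD800 + (offset >> 10)   # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits of the offset
    return high, low


if __name__ == "__main__":
    for enc in ("cp1252", "utf-8", "utf-16-le"):
        print(f"U+00E9  {enc:10} {hex_bytes(E_ACUTE, enc)}")
    for enc in ("utf-8", "utf-16-le", "utf-16-be"):
        print(f"U+1F600 {enc:10} {hex_bytes(GRINNING, enc)}")
    high, low = surrogate_pair(GRINNING)
    print(f"surrogates: {high:04X} {low:04X}")
```

A BMP character such as U+00E9 exercises only the single-byte versus
multi-byte difference, whereas U+1F600 additionally exercises the
surrogate handling and byte order of the UTF-16 variants.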