2009/8/4 Junio C Hamano <gitster@xxxxxxxxx>:
>
> Thanks, Zoltán.
>
> We should be able to set up a script that scrapes the output to test this
> kind of thing.  We may not want to have a test pattern that matches too
> strictly for the current structure and appearance of the output
> (e.g. counting nested <div>s, presentation styles and such), but if we can
> robustly scrape off HTML tags (e.g. "elinks -dump") and check the
> remaining payload, it might be enough.
>
> Jakub what do you think?  I suspect that scraping approach may turn out to
> be too fragile for tests to be worth doing, but I am just throwing out a
> thought.
>

This issue shows up when the chop_and_escape_str function is called with a
non-ASCII string (like my name :)) without first calling to_utf8 on it.
"author_name" and "committer_name" are two examples, and "author_name"
shows up with bad encoding in the HTML.

Example from one of my repos (a small piece of the shortlog output):

<td class="author"><span title="Füzesi Zoltán">Füzesi Zoltán</span></td>

After applying the patch:

<td class="author">Füzesi Zoltán</td>

This is an "old" issue (seen in version 1.5.6 too) and, I think, a minor
one. I haven't yet spent time thinking about how a test script could show
this. Waiting for Jakub's reaction.
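
The scraping idea discussed above could look roughly like the sketch below. It is written in Python only for illustration; it is not part of the patch or of git's test suite, and every name in it (PayloadScraper, scrape, mis_decoded, name_is_intact) is hypothetical. It also assumes the bad encoding is the common "UTF-8 bytes read as Latin-1" kind, which the thread does not confirm. The scraper ignores tag structure and keeps only text nodes and title attributes, so the check does not depend on how the <td> is nested or styled.

```python
from html.parser import HTMLParser


class PayloadScraper(HTMLParser):
    """Collect text nodes and title attributes, ignoring tag structure."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text = []
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # Keep title="..." values, since the name may appear there as well.
        for name, value in attrs:
            if name == "title" and value is not None:
                self.titles.append(value)

    def handle_data(self, data):
        # Keep non-empty text between tags.
        stripped = data.strip()
        if stripped:
            self.text.append(stripped)


def scrape(html):
    """Return (text nodes, title attribute values) found in html."""
    parser = PayloadScraper()
    parser.feed(html)
    parser.close()
    return parser.text, parser.titles


def mis_decoded(name):
    # Assumed failure mode: the UTF-8 bytes of the name read as Latin-1.
    return name.encode("utf-8").decode("latin-1")


def name_is_intact(html, expected):
    """True if expected appears in the payload and its garbled form does not."""
    text, titles = scrape(html)
    payload = text + titles
    return expected in payload and mis_decoded(expected) not in payload
```

A test built on this would feed it the shortlog page and call name_is_intact(page, "Füzesi Zoltán"). Because only the payload is compared, changes to the surrounding markup should not break the check, though it still depends on the name appearing as a whole text node or title value.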