Re: [PATCH] gitweb: parse_commit_text encoding fix

2009/8/4 Junio C Hamano <gitster@xxxxxxxxx>:
>
> Thanks, Zoltán.
>
> We should be able to set up a script that scrapes the output to test this
> kind of thing.  We may not want to have a test pattern that matches too
> strictly for the current structure and appearance of the output
> (e.g. counting nested <div>s, presentation styles and such), but if we can
> robustly scrape off HTML tags (e.g. "elinks -dump") and check the
> remaining payload, it might be enough.
>
> Jakub what do you think?  I suspect that scraping approach may turn out to
> be too fragile for tests to be worth doing, but I am just throwing out a
> thought.
>

This issue shows up when the chop_and_escape_str function is called with
a non-ASCII string (like my name :)) without first calling to_utf8 on
it. "author_name" and "committer_name" are two examples, and
"author_name" shows up with bad encoding in the HTML.
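
To illustrate the failure mode, here is a minimal sketch in Python. It is not
gitweb's code (gitweb is Perl), and the variable names are made up for the
example. It assumes the problem is the usual one: UTF-8 bytes that were never
decoded get treated as Latin-1 characters and are then encoded to UTF-8 again
on output.

```python
# Illustration only: mimics the suspected double-encoding, not gitweb's Perl.
name = "Füzesi Zoltán"

# Bytes as they would be stored in a UTF-8 commit object.
raw = name.encode("utf-8")

# Suspected bug: the raw bytes are taken as Latin-1 characters and then
# written out through a UTF-8 output layer, so each non-ASCII character
# turns into two wrong characters.
garbled = raw.decode("latin-1").encode("utf-8").decode("utf-8")

# Fix: decode the bytes as UTF-8 first (roughly the role to_utf8 plays),
# and only then chop and escape the resulting string.
fixed = raw.decode("utf-8")

print(garbled)
print(fixed)
```

In this sketch only "fixed" round-trips to the original name; "garbled" is
two characters longer because "ü" and "á" each expand into a pair.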

Example from one of my repos (a small piece of the shortlog output):
<td class="author"><span title="Füzesi Zoltán">Füzesi Zoltán</span></td>
After applying the patch:
<td class="author">Füzesi Zoltán</td>

This is an "old" issue (seen in version 1.5.6 too) and, I think, a minor one.
I haven't yet spent time thinking about how a test script could show this.
Waiting for Jakub's reaction.
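
As a rough idea of what the "scrape off the HTML tags and check the remaining
payload" approach could look like, here is a small Python sketch. The class
and function names (TextOnly, strip_tags) are hypothetical, and a real test in
git's test suite would more likely be shell or Perl; this only shows the shape
of the check.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects only the text between tags, ignoring markup and attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    """Return the text payload of an HTML fragment (hypothetical helper)."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Both forms of the author cell should yield the same visible payload.
with_span = ('<td class="author"><span title="Füzesi Zoltán">'
             'Füzesi Zoltán</span></td>')
without_span = '<td class="author">Füzesi Zoltán</td>'

print(strip_tags(with_span))
print(strip_tags(without_span))
```

A check like this does not depend on how the <td> and <span> elements are
nested, only on the visible text. Note that it ignores attribute values such
as title="...", so those would need a separate check.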