On Wed, Jul 03, 2013 at 02:41:06PM -0700, Junio C Hamano wrote:
> John Keeping <john@xxxxxxxxxxxxx> writes:
>
> > My system doesn't have the en_US.UTF-8 locale (or plain en_US), which
> > causes t4205 to fail by counting bytes instead of UTF-8 codepoints.
> >
> > Instead of using sed for this, use Perl which behaves predictably
> > whatever locale is in use.
> >
> > Signed-off-by: John Keeping <john@xxxxxxxxxxxxx>
> > ---
> > This patch is on top of 'as/log-output-encoding-in-user-format'.
>
> Thanks.  I think Alexey is going to send incremental updates to the
> topic so I won't interfere by applying this patch on top of the
> version I have in my tree.
>
> But I do agree that using Perl may be a workable solution.
>
> An alternative might be not to use this cryptic 3-arg form of
> commit_msg at all.  They are used only for these three:
>
>     $(commit_msg "" "8" "..*$")
>     $(commit_msg "" "0" ".\{11\}")
>     $(commit_msg "" "4" ".\{11\}")
>
> I somehow find them simply not readable, in order to figure out what
> is going on.
>
> Just using three variables to hold what are expected would be far
> more portable and readable.
>
>     # "anfänglich" whatever it means.
>     sample_utf8_part=$(printf "anf\303\244ng")
>
>     commit_msg () {
>         msg="initial. ${sample_utf8_part}lich";
>         if test -n "$1"
>         then
>             echo "$msg" | iconv -f utf-8 -t "$1"
>         else
>             echo "$msg"
>         fi
>     }
>
> And then instead of writing in the expected test output.
>
>     $(commit_msg "" "8" "..*$")
>     $(commit_msg "" "0" ".\{11\}")
>     $(commit_msg "" "4" ".\{11\}")
>
> we can just say
>
>     initial...
>     ..an${sample_utf8_part}lich
>     init..lich
>
> It is no worse than those cryptic 0, 4, 8 and 11 magic numbers we
> see in the test, no?

Yep! When I was thinking about Johannes's suggestions, I finally came
to a decision much like yours.
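
For reference, here is a minimal sketch of the one-argument helper discussed above, written so it can be run outside git's test harness. It follows the quoted snippet, with two assumptions of its own: printf '%s\n' replaces echo, and the msg_bytes variable is a hypothetical addition used only to show why a byte-counting sed cuts the message in the wrong place.

```shell
#!/bin/sh
# Sketch only: mirrors the helper quoted above, outside git's test harness.

# "anfäng" spelled as raw UTF-8 bytes; \303\244 is the two-byte encoding of 'ä'.
sample_utf8_part=$(printf "anf\303\244ng")

# Print the sample message, converted to the encoding named in $1 if given.
commit_msg () {
	msg="initial. ${sample_utf8_part}lich"
	if test -n "$1"
	then
		printf '%s\n' "$msg" | iconv -f utf-8 -t "$1"
	else
		printf '%s\n' "$msg"
	fi
}

# Hypothetical illustration: the message is 19 characters long but occupies
# 20 bytes, so counting bytes instead of codepoints is off by one past the 'ä'.
msg_bytes=$(commit_msg "" | tr -d '\n' | wc -c | tr -d ' ')
echo "message length in bytes: $msg_bytes"
```

Calling commit_msg "" prints the message in UTF-8, while something like commit_msg ISO-8859-1 pipes it through iconv (assuming iconv is installed and knows that encoding name). Since the expected output is then spelled out literally around ${sample_utf8_part}, no locale-dependent regex is needed to build it.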