Re: gitweb forgets to send utf8 header for raw blob views

Jan Engelhardt wrote:
> [utf8 does not work] for raw blob views like
> http://dev.medozas.de/gitweb.cgi?p=hxtools;a=blob_plain;f=bin/git-forest;hb=HEAD

Gitweb should probably not be recoding blobs, so the best I can think of is to check for UTF-8 validity and add charset=utf-8 if the blob is valid (and otherwise leave the charset undeclared).
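To make the idea concrete, here is a minimal sketch of that check. It is written in Python purely for illustration (gitweb itself is Perl), and the function name and the text/plain default are hypothetical, not anything gitweb has today:

```python
def content_type_for_blob(data: bytes, base_type: str = "text/plain") -> str:
    """Return a Content-Type value for a raw blob.

    Hypothetical helper: declares charset=utf-8 only when the whole blob
    is valid UTF-8; otherwise the charset is left undeclared. The blob
    bytes themselves are never recoded.
    """
    try:
        # bytes.decode() is strict by default and raises on invalid input.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return base_type
    return f"{base_type}; charset=utf-8"
```

Note that pure-ASCII blobs are also valid UTF-8, so they would get charset=utf-8 as well, which should be harmless.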

The drawback is that we cannot send plain blobs without reading them into memory (or reading them twice), since we have to check the whole blob for UTF-8 validity before sending it. (Gitweb currently reads the whole blob into memory, but that's unnecessary and could be changed in the future.)

After my next refactoring, there *might* be some chance to easily implement something like "if it's smaller than x KB (e.g. 512), read it into memory, check for valid UTF-8, and add charset=utf-8 if it is valid; otherwise don't read it into memory and send it without charset=utf-8 [or perhaps check for BOM presence at the beginning]." I'll keep it in mind if/when it comes up in my refactoring and get back to the mailing list about it.
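Roughly, the size-threshold idea could look like the sketch below. Again this is Python for illustration only, not gitweb code; the function name, the limit parameter, and the chunk size are all assumptions, and the BOM check for large blobs is the optional variant from the bracketed remark:

```python
import codecs
import io
from typing import BinaryIO, Iterator, Tuple

SIZE_LIMIT = 512 * 1024   # the "x KB" threshold; 512 KB is only the example value
CHUNK_SIZE = 64 * 1024    # arbitrary streaming chunk size (assumption)


def plan_blob_response(stream: BinaryIO, size: int,
                       base_type: str = "text/plain",
                       limit: int = SIZE_LIMIT) -> Tuple[str, Iterator[bytes]]:
    """Return (content_type, body_chunks) for a raw blob (hypothetical helper).

    Small blobs are read fully and validated as UTF-8. Large blobs are
    streamed without validation; charset=utf-8 is declared for them only
    if they start with a UTF-8 BOM.
    """
    utf8_type = f"{base_type}; charset=utf-8"

    if size <= limit:
        data = stream.read()
        try:
            data.decode("utf-8")          # strict validation of the whole blob
            return utf8_type, iter([data])
        except UnicodeDecodeError:
            return base_type, iter([data])

    # Large blob: look only at the first three bytes, then stream the rest.
    head = stream.read(len(codecs.BOM_UTF8))
    ctype = utf8_type if head == codecs.BOM_UTF8 else base_type

    def chunks() -> Iterator[bytes]:
        yield head                        # bytes already consumed for the BOM check
        while True:
            block = stream.read(CHUNK_SIZE)
            if not block:
                return
            yield block

    return ctype, chunks()
```

With this shape the header is decided before any body bytes are sent, and only blobs under the limit are ever held in memory in full.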

-- Lea