Re: [PATCH/RFC v2] Document format of basic Git objects

2012/2/19 Junio C Hamano <gitster@xxxxxxxxx>:
>>  - Do we assume tag/commit header in utf-8 or ascii?
>
> Author-ident is typically utf-8 already, so you cannot assume "ASCII".

I wonder whether anyone actually puts non-UTF-8 strings in there. If
not, could we enforce UTF-8 (i.e. validate and reject non-UTF-8
strings) and accept encoded-word syntax (RFC 2047) with the help of the
new $GIT_IDENT_ENCODING variable? The "accept ..." part can wait until
someone is hit by the "UTF-8 only" check and steps up.
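
To make the "validate and reject" part concrete, here is a minimal
sketch in Python (for brevity; this is not Git's C code, and the
function name is made up for illustration). It only checks that the
raw ident bytes decode as strict UTF-8:

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the raw bytes decode as strict UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# The same name encoded two ways: UTF-8 passes, Latin-1 is rejected.
utf8_ident = "Jürgen".encode("utf-8")
latin1_ident = "Jürgen".encode("latin-1")
print(is_valid_utf8(utf8_ident), is_valid_utf8(latin1_ident))
```

Whether a failed check should be a hard error or just a warning is a
separate policy question that the sketch does not answer.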

By the same reasoning, maybe we should declare that tag content is
UTF-8 only, until someone needs "encoding" support for it and adds it.

>> +The filename may be an arbitrary nonempty string of bytes, as long as
>> +it contains no '/' or NUL character.
>
> s/, as long as it contains no/; it cannot contain any/

A pathname also cannot be "." or "..", I suppose.
Since we also support Windows, should '\\' be banned too? ... Probably
not worth it.
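
Put together, the filename rules discussed so far could be checked
roughly like this. It is a Python sketch only, with a made-up function
name, and it assumes the "." / ".." restriction really applies.
Backslash is deliberately left allowed:

```python
def is_valid_tree_entry_name(name: bytes) -> bool:
    """Check a tree entry filename against the rules discussed above."""
    if not name:
        # The filename must be nonempty.
        return False
    if b"/" in name or b"\0" in name:
        # '/' and NUL cannot appear anywhere in the name.
        return False
    if name in (b".", b".."):
        # Assumed restriction: "." and ".." are not usable as entries.
        return False
    # Anything else, including '\\', is accepted as an arbitrary byte string.
    return True


for candidate in (b"README", b"a/b", b"..", b"dir\\file", b""):
    print(candidate, is_valid_tree_entry_name(candidate))
```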

>> +The header must not contain NUL.
>
> I vaguely recall that you made sure neither the header nor the body
> contains NUL.

One of the purposes of this document is to note all constraints and
limitations (another is to serve as a reference for users who want to
dig deep into Git's data structures without looking at the code). The
problem with handling NUL probably only exists in C Git (and maybe
libgit2). I'll change that to "should not contain NUL".
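
For readers of the document, a check for this could look like the
sketch below. It is Python with a made-up function name, and it assumes
the usual commit/tag layout in which the header ends at the first blank
line. It only looks at the header, not the body:

```python
def header_has_nul(payload: bytes) -> bool:
    """Return True if the header part of a commit/tag payload contains NUL."""
    # Assumption: the header is everything before the first blank line.
    header, _sep, _body = payload.partition(b"\n\n")
    return b"\0" in header


clean = b"tree " + b"0" * 40 + b"\nauthor A U Thor <a@example.com> 0 +0000\n\nmessage\n"
dirty = b"tree " + b"0" * 40 + b"\nauthor A\0U Thor <a@example.com> 0 +0000\n\nmessage\n"
print(header_has_nul(clean), header_has_nul(dirty))
```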
-- 
Duy