Re: [PATCH v2] docs: rewrite the documentation of the text and eol attributes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Alex Henrie <alexhenrie24@xxxxxxxxx> writes:

> These two sentences are confusing because the description of the text
> attribute sounds exactly the same as the description of the text=auto
> attribute:
>
> "Setting the text attribute on a path enables end-of-line normalization"
>
> "When text is set to "auto", the path is marked for automatic
> end-of-line conversion"
>
> Unless the reader is already familiar with the two variants, there's a
> high probability that they will think that "end-of-line normalization"
> is the same thing as "automatic end-of-line conversion".
>
> It's also not clear that the phrase "When the file has been committed
> with CRLF, no conversion is done" in the paragraph for text=auto does
> not apply equally to the bare text attribute which is described earlier.
> Moreover, it falsely implies that normalization is only suppressed if
> the file has been committed. In fact, running `git add` on a CRLF file,
> adding the text=auto attribute to the file, and running `git add` again
> does not do anything to the line endings either.
>
> On top of that, in several places the documentation for the eol
> attribute sounds like it can force normalization on checkin and checkout
> all by itself, but eol doesn't control normalization on checkin and
> doesn't control normalization on checkout either unless accompanied by
> the text attribute.
>
> Rephrase the documentation of text, text=auto, eol, eol=crlf, and eol=lf
> to be clear about how they are the same, how they are different, and in
> what cases normalization is performed.

All good observations, appropriate motivation of the change.

> Signed-off-by: Alex Henrie <alexhenrie24@xxxxxxxxx>
> ---
> v2: rewrite completely and rewrite the eol documentation too
> ---

Thanks.  I wonder helped-by tboegi@ might be in order, but hopefully
it may come during the review of this round.

> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 39bfbca1ff..bcea84f439 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -120,20 +120,22 @@ repository upon 'git add' and 'git commit'.
>  `text`
>  ^^^^^^
>  
> -This attribute enables and controls end-of-line normalization.  When a
> -text file is normalized, its line endings are converted to LF in the
> -repository.  To control what line ending style is used in the working
> -directory, use the `eol` attribute for a single file and the
> -`core.eol` configuration variable for all text files.
> -Note that setting `core.autocrlf` to `true` or `input` overrides
> -`core.eol` (see the definitions of those options in
> -linkgit:git-config[1]).
> +This attribute marks the path as a text file, which enables end-of-line
> +normalization on checkin and possibly also checkout: When a matching
> +file is added to the index, even if it has CRLF line endings in the
> +working directory, the file is stored in the index with LF line endings.
> +Conversely, when the file is copied from the index to the working
> +directory, its line endings may be converted from LF to CRLF depending
> +on the `eol` attribute, the Git config, and the platform (see
> +explanation of `eol` below).

Four points.

1. Overall, the new text is clearly written, and I think it made the
   updated description of "Set" and "Set to string 'auto'" easier to
   understand.

2. "on checkin and possibly also checkout" may be redundant, as the
   section in which the `text` attribute is described is all about
   various conversions upon checking in (i.e. moving contents to the
   index) and checking out (i.e. moving contents out of the index).
   The introductory text for the whole section that does not define
   what "check-in" and "check-out" are and instead loosely says
   "commands such as" may want to be clarified, but that can be done
   as a follow-on clean-up work after this patch lands.   

3. It is unclear why we would want to mention "even if it has CRLF
   line endings in the working directory".  "Regardless of the line
   endings in the working tree files" may slightly be better [*1*],
   but I think the text reads better without it as storing contents
   with possibly different end-of-line convention is the point of
   "end-of-line normalization" in the first place.

   "the file is stored in the index with LF line endings" may better
   be rephrased to "the contents is stored, after converting to use
   LF line endings if necessary, in the index", so that we can use
   the verb "convert" to stress the point of setting this attribute.

4. This problem is shared with the original text, but "This
   attribute marks the path as a text file" might be a bit
   misleading.  If it is left unspecified (i.e. unmentioned, or any
   earlier settings by explicitly overriding them with "!text") or
   worse yet, explicitly unset (i.e. "-text"), it marks the path as
   a non text.

      This attribute is used to mark the path as a text file, which
      needs certain end-of-line normalization upon check-in and
      check-out, or a non-text file, which do not.

  I dunno if the extra verbosity is too much, though.

>  Set::
>  
>  	Setting the `text` attribute on a path enables end-of-line
> -	normalization and marks the path as a text file.  End-of-line
> -	conversion takes place without guessing the content type.
> +	normalization on checkin and checkout as described above.  Line
> +	endings are normalized in the index the next time the file is
> +	checked in, even if the file was previously added to Git with CRLF
> +	line endings.

"the next time" -> "every time", but that is probably moot, because
I would question the need for the whole sentence.

The last sentence, starting with "Line endigns are normalized", may
be redundant and we probably are better off without it.  It would
risk becoming maintenance burden because we have to make sure it
stays in sync with what you wrote in the introductory text.  "as
described above" before that sentence is clear enough.

> @@ -142,10 +144,11 @@ Unset::
>  
>  Set to string value "auto"::
>  
> -	When `text` is set to "auto", the path is marked for automatic
> -	end-of-line conversion.  If Git decides that the content is
> -	text, its line endings are converted to LF on checkin.
> -	When the file has been committed with CRLF, no conversion is done.
> +	When `text` is set to "auto", Git decides by itself whether the file
> +	is text or binary.  If it is text and the file was not already in
> +	Git with CRLF endings, line endings are converted on checkin and
> +	checkout as described above.  Otherwise, no conversion is done on
> +	checkin or checkout.

Nice.

> @@ -162,23 +165,28 @@ unspecified.
>  This attribute sets a specific line-ending style to be used in the
>  working directory.  This attribute has effect only if the `text`
>  attribute is set or unspecified, or if it is set to `auto`, the file is
> -detected as text, and it is stored with LF endings in the index.  Note
> -that setting this attribute on paths which are in the index with CRLF
> -line endings may make the paths to be considered dirty unless
> -`text=auto` is set. Adding the path to the index again will normalize
> -the line endings in the index.
> +detected as text, and it is stored with LF endings in the index.

This level of clarification is very much appreciated.

We may want to use the phrase "checking out" somewhere in
conjunction with "in the working directory" to be consistent with
how the section is titled.  Something like

    This attribute marks a path to use a specific line-ending style
    in the working tree when it is checked out.

>  Set to string value "crlf"::
>  
> +	This setting converts the file's line endings in the working
> +	directory to CRLF when the file is checked out.

Nice.

>  Set to string value "lf"::
>  
> -	This setting forces Git to normalize line endings to LF on
> -	checkin and prevents conversion to CRLF when the file is
> -	checked out.
> +	This setting uses the same line endings in the working directory as
> +	in the index, whether they are LF or CRLF.  However, unless
> +	`text=auto`, adding the file to the index again will normalize its
> +	line endings to LF in the index.

First of all, "uses the same, except if ... then LF is used" is much
harder to understand than "use LF, except if ..." as an explanation
for "lf".

The description may be correct but is 'eol=lf' what controls how the
normalization is done upon checking in?

As you said in the explanation for the `eol` attribute as a whole,
`eol` is effective only when `text` says the path is a text file,
and that (i.e. the fact that `text` controls the end-of-line) makes
the path to use LF line endings, isn't it?  The last sentence
starting with "However, unless" that talks about conversion going
into the index feels out of place.

There are some exception handling in the code for odd cases like the
contents with mixed line endings, a path set to use LF but the file
actually has CRLF, etc.  While they are worth describing, I wonder
if they should be done in a separate paragraph.

> +Unspecified::
> +
> +	If the `eol` attribute is unspecified for a file, its line endings
> +	in the working directory are determined by the `core.autocrlf` or
> +	`core.eol` configuration variable (see the definitions of those
> +	options in linkgit:git-config[1]).  The default if `text` is set but
> +	neither of those variables is is `eol=lf` on Unix and `eol=crlf` on
> +	Windows.

Nice.


[Footnote]

*1* The intent is not about special-casing CRLF but any other; if
    Git were to be extended to support ancient MacOS, we would
    convert even if it uses CR line endings.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux