Re: [PATCH v2] docs: rewrite the documentation of the text and eol attributes

On Thu, Apr 27, 2023 at 10:22:21PM -0600, Alex Henrie wrote:

Thanks for picking this up.
There have been some comments from Junio; I haven't had time to look at
them yet.
Some of my comments are inline, let's see how we can converge things.

> These two sentences are confusing because the description of the text
> attribute sounds exactly the same as the description of the text=auto
> attribute:
>
> "Setting the text attribute on a path enables end-of-line normalization"
>
> "When text is set to "auto", the path is marked for automatic
> end-of-line conversion"
>
> Unless the reader is already familiar with the two variants, there's a
> high probability that they will think that "end-of-line normalization"
> is the same thing as "automatic end-of-line conversion".
>
> It's also not clear that the phrase "When the file has been committed
> with CRLF, no conversion is done" in the paragraph for text=auto does
> not apply equally to the bare text attribute which is described earlier.
> Moreover, it falsely implies that normalization is only suppressed if
> the file has been committed. In fact, running `git add` on a CRLF file,
> adding the text=auto attribute to the file, and running `git add` again
> does not do anything to the line endings either.
>
> On top of that, in several places the documentation for the eol
> attribute sounds like it can force normalization on checkin and checkout
> all by itself, but eol doesn't control normalization on checkin and
> doesn't control normalization on checkout either unless accompanied by
> the text attribute.
>
> Rephrase the documentation of text, text=auto, eol, eol=crlf, and eol=lf
> to be clear about how they are the same, how they are different, and in
> what cases normalization is performed.
>
> Signed-off-by: Alex Henrie <alexhenrie24@xxxxxxxxx>
> ---
> v2: rewrite completely and rewrite the eol documentation too
> ---
>  Documentation/gitattributes.txt | 58 +++++++++++++++++++--------------
>  1 file changed, 33 insertions(+), 25 deletions(-)
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 39bfbca1ff..bcea84f439 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -120,20 +120,22 @@ repository upon 'git add' and 'git commit'.
>  `text`
>  ^^^^^^
>
> -This attribute enables and controls end-of-line normalization.  When a

Hm, not only, I think. The terminology is probably not very well specified.
I would say that it "controls end-of-line conversion".
There are two types of conversion: from CRLF into LF and from LF into CRLF.
The CRLF -> LF conversion happens only at `git commit`
(strictly speaking, already at `git add`) and is called normalization,
because in Git a "normalized" file has LF in the repo (and index).
The term "normalize" has even been added to two commands:

`git add --renormalize .`
inspired by
`git merge -Xrenormalize`
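
To make the two directions concrete, here is a rough sketch in Python.
It is only an illustration, not Git's implementation: the function names
are made up for this mail, and Git itself applies extra checks that this
sketch ignores.

```python
def normalize_to_lf(data: bytes) -> bytes:
    """Checkin direction ("normalization"): every CRLF pair becomes LF.

    A lone CR that is not followed by LF is left untouched.
    """
    return data.replace(b"\r\n", b"\n")


def lf_to_crlf(data: bytes) -> bytes:
    """Checkout direction: every line ending becomes CRLF.

    Normalizing first avoids turning an existing CRLF into CR CR LF.
    """
    return normalize_to_lf(data).replace(b"\n", b"\r\n")


if __name__ == "__main__":
    worktree = b"first\r\nsecond\n"
    in_index = normalize_to_lf(worktree)
    print(in_index)              # what would be stored in the index
    print(lf_to_crlf(in_index))  # what an eol=crlf checkout would write
```

Only the first function is "normalization" in the sense used above; the
second one is the conversion that may happen at checkout.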

> -text file is normalized, its line endings are converted to LF in the
> -repository.  To control what line ending style is used in the working
> -directory, use the `eol` attribute for a single file and the
> -`core.eol` configuration variable for all text files.

In general, the `eol` attribute can be used for more than a single file.
And that is what we (or at least I) recommend doing, as a kind
of best practice. For example:

*.sh text eol=lf

core.eol is the fallback when the eol attribute is not specified.
But then we look at core.autocrlf before looking at core.eol,
as pointed out below.
And if none of them is set, Git uses the platform native setting,
which is CRLF for Windows and LF for all other systems.


> -Note that setting `core.autocrlf` to `true` or `input` overrides
> -`core.eol` (see the definitions of those options in

Looking with fresh eyes: I am not sure if I like this historical construct.
First we say that core.eol sets the line endings (if not defined in
the attribute), and then we say that core.autocrlf overrides core.eol.

This is mainly for historical reasons.
I think that things go like this:
When text or text=auto is set (and Git identifies the file as text),
and the eol attribute is not set, then:
core.autocrlf=true gives CRLF
core.autocrlf=input gives LF
core.autocrlf=false looks at core.eol:
  core.eol=crlf gives CRLF
  core.eol=lf gives LF
  core.eol unset gives the platform default
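
The lookup order above could be written down roughly like this in
Python. This is just a sketch of my understanding, not the real code:
the function and parameter names are invented here, and it assumes the
path is already being treated as text.

```python
import sys


def worktree_eol(eol_attr=None, autocrlf="false", core_eol=None,
                 platform=sys.platform):
    """Return "crlf" or "lf" for the working tree of a text path.

    eol_attr  -- value of the eol attribute, or None if unspecified
    autocrlf  -- core.autocrlf as "true", "input" or "false"
    core_eol  -- core.eol as "lf" or "crlf"; None means unset
    platform  -- hypothetical platform switch, defaults to sys.platform
    """
    # 1. The eol attribute wins when it is set.
    if eol_attr in ("lf", "crlf"):
        return eol_attr
    # 2. core.autocrlf is consulted before core.eol.
    if autocrlf == "true":
        return "crlf"
    if autocrlf == "input":
        return "lf"
    # 3. core.autocrlf=false: fall back to core.eol.
    if core_eol in ("lf", "crlf"):
        return core_eol
    # 4. Nothing set: platform default, CRLF on Windows and LF elsewhere.
    return "crlf" if platform.startswith("win") else "lf"


if __name__ == "__main__":
    print(worktree_eol(autocrlf="true", core_eol="lf"))
    print(worktree_eol(eol_attr="lf", autocrlf="true"))
```

Written this way, core.autocrlf no longer "overrides" anything; it is
simply looked at earlier than core.eol.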

> -linkgit:git-config[1]).
> +This attribute marks the path as a text file, which enables end-of-line
> +normalization on checkin and possibly also checkout: When a matching
As said before.
 ...normalization on checkin and possibly conversion at checkout...
 or
  ... conversion on checkin and possibly also checkout...

> +file is added to the index, even if it has CRLF line endings in the
> +working directory, the file is stored in the index with LF line endings.
> +Conversely, when the file is copied from the index to the working
"copied" is not an ideal word here:
We may specify a filter and/or an encoding as well.
Would "transferred and possibly filtered/encoded" be better?

> +directory, its line endings may be converted from LF to CRLF depending
> +on the `eol` attribute, the Git config, and the platform (see
> +explanation of `eol` below).
>
>  Set::
>
>  	Setting the `text` attribute on a path enables end-of-line
> -	normalization and marks the path as a text file.  End-of-line
> -	conversion takes place without guessing the content type.
> +	normalization on checkin and checkout as described above.  Line
> +	endings are normalized in the index the next time the file is
> +	checked in, even if the file was previously added to Git with CRLF
> +	line endings.
>
>  Unset::
>
> @@ -142,10 +144,11 @@ Unset::
>
>  Set to string value "auto"::
>
> -	When `text` is set to "auto", the path is marked for automatic
> -	end-of-line conversion.  If Git decides that the content is
> -	text, its line endings are converted to LF on checkin.
> -	When the file has been committed with CRLF, no conversion is done.
> +	When `text` is set to "auto", Git decides by itself whether the file
> +	is text or binary.  If it is text and the file was not already in
> +	Git with CRLF endings, line endings are converted on checkin and
> +	checkout as described above.  Otherwise, no conversion is done on
> +	checkin or checkout.

Side note: we previously talked about files; "path" is better.
>
>  Unspecified::
>
> @@ -162,23 +165,28 @@ unspecified.
>  This attribute sets a specific line-ending style to be used in the
>  working directory.  This attribute has effect only if the `text`
>  attribute is set or unspecified, or if it is set to `auto`, the file is
> -detected as text, and it is stored with LF endings in the index.  Note
> -that setting this attribute on paths which are in the index with CRLF
> -line endings may make the paths to be considered dirty unless
> -`text=auto` is set.
... or `git add --renormalize <path>` is run.

Adding the path to the index again will normalize
> -the line endings in the index.
> +detected as text, and it is stored with LF endings in the index.
>
>  Set to string value "crlf"::
>
> -	This setting forces Git to normalize line endings for this
> -	file on checkin and convert them to CRLF when the file is
> -	checked out.
> +	This setting converts the file's line endings in the working
> +	directory to CRLF when the file is checked out.
>
>  Set to string value "lf"::
>
> -	This setting forces Git to normalize line endings to LF on
> -	checkin and prevents conversion to CRLF when the file is
> -	checked out.
> +	This setting uses the same line endings in the working directory as
> +	in the index, whether they are LF or CRLF.  However, unless
> +	`text=auto`, adding the file to the index again will normalize its
> +	line endings to LF in the index.
> +
> +Unspecified::
> +
> +	If the `eol` attribute is unspecified for a file, its line endings
> +	in the working directory are determined by the `core.autocrlf` or
> +	`core.eol` configuration variable (see the definitions of those
> +	options in linkgit:git-config[1]).  The default if `text` is set but
> +	neither of those variables is is `eol=lf` on Unix and `eol=crlf` on
> +	Windows.
>
>  Backwards compatibility with `crlf` attribute
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> --
> 2.40.0
>

Thanks again for working on this.
/Torsten



