Re: [PATCH v3 1/3] clang-format: re-adjust line break penalties

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 11, 2024 at 6:50 PM Karthik Nayak <karthik.188@xxxxxxxxx> wrote:
>
> In 42efde4c29 (clang-format: adjust line break penalties, 2017-09-29) we
> adjusted the line break penalties to really fine tune what we care about
> while doing line breaks. Modify some of those to be more inline with
> what we care about in the Git project now.
>
> We need to understand that the values set to penalties in
> '.clang-format' are relative to each other and do not hold any absolute
> value. The penalty arguments take an 'Unsigned' value, so we have some
> liberty over the values we can set.
>
> First, in that commit, we decided, that under no circumstances do we
> want to exceed 80 characters. This seems a bit too strict. We do
> overshoot this limit from time to time to prioritize readability.

I think that attempting to get the weights right so as to avoid cases
where there was an intentional affordance for readability is going to
be essentially impossible. Areas where there's an intentional
disregard for the clang-format-generated formatting should disable the
formatter for that line/region, instead of trying to find a way to
adjust the rules to produce something that's going to end up being
context dependent.

Example: In ref-filter.c, there's 13 lines when initializing the
`valid_atom` array that are >80 characters, and 20 lines that are >80
columns (when using 8-space tabs). Line breaking in that block of code
may be undesirable, so just disable clang-format there. I don't think
there's a consistent set of penalties you could establish that would
handle that well without mishandling some other section of code.

It's also not clear what the reason for the overshoot is in many cases.
- difference between "80 characters" and "80 columns"?
    - (1394 >80char lines in *.{h,c}, 4849 >80col lines in the same files)
- intentional for readability?
- refactorings pushed originally compliant lines out of compliance?
- no one caught it and it was just added without any intentional decision?

> So
> let's reduce the value for 'PenaltyExcessCharacter' to 10. This means we
> that we add a penalty of 10 for each character that exceeds the column
> limit. By itself this is enough to restrict to column limit. Tuning
> other penalties in relation to this is what is important.
>
> The penalty `PenaltyBreakAssignment` talks about the penalty for
> breaking an assignment operator on to the next line. In our project, we
> are okay with this, so giving a value of 5, which is below the value for
> 'PenaltyExcessCharacter' ensures that in the end, even 1 character over
> the column limit is not worth keeping an assignment on the same line.
>
> Similarly set the penalty for breaking before the first call parameter
> 'PenaltyBreakBeforeFirstCallParameter' and the penalty for breaking
> comments 'PenaltyBreakComment' and the penalty for breaking string
> literals 'PenaltyBreakString' also to 5.
>
> Finally, we really care about not breaking the return type into its own
> line and we really care about not breaking before an open parenthesis.
> This avoids weird formatting like:
>
>    static const struct strbuf *
>           a_really_really_large_function_name(struct strbuf resolved,
>           const char *path, int flags)

Is this how it'd be indented without the penalties, or would it do
this, with the function name indented the same amount as the return
type (which is, in C, probably going to be the 0th column most times):

static const struct strbuf *
a_really_really_large_function_name(struct strbuf resolved,
        const char *path, int flags)

>
> or
>
>    static const struct strbuf *a_really_really_large_function_name(
>             struct strbuf resolved, const char *path, int flags)

Personal opinion: I prefer this over the version that has a single
argument on the first line. My preference for reading functions is:

return_type func_name(arg1, arg2,
                      arg3, arg4,
                      arg5, arg6, ...);

Or

return_type func_name(
        arg1, arg2, arg3, arg4,
        arg5, arg6, ...);

or, in some cases, putting every argument on their own line (typically
when the majority of the arguments are already on their own line, not
having one "hiding" somewhere is preferable, but at this point if
that's not what my formatter does, I don't fight it).

For functions that accept an obvious first parameter, such as
`strbuf_add`, maybe having the first parameter on the first line is
acceptable/desirable, since it's "obvious" what it is/does. But for
many functions that's not the case, and needing to read the end of the
first line, potentially beyond the 80th column, feels weird.

>
> to instead have something more readable like:
>
>    static const struct strbuf *a_really_really_large_function_name(struct strbuf resolved,
>           const char *path, int flags)
>
> This is done by bumping the values of 'PenaltyReturnTypeOnItsOwnLine'
> and 'PenaltyBreakOpenParenthesis' to 300. This is so that we can allow a
> few characters above the 80 column limit to make code more readable.

A few examples, such as by formatting the code using the current rules
(since much of the codebase does not currently comply), and then
changing the penalties and seeing what changes, might be nice?


>
> Signed-off-by: Karthik Nayak <karthik.188@xxxxxxxxx>
> ---
>  .clang-format | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/.clang-format b/.clang-format
> index 41969eca4b..66a2360ae5 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -209,13 +209,14 @@ KeepEmptyLinesAtTheStartOfBlocks: false
>
>  # Penalties
>  # This decides what order things should be done if a line is too long
> -PenaltyBreakAssignment: 10
> -PenaltyBreakBeforeFirstCallParameter: 30
> -PenaltyBreakComment: 10
> +PenaltyBreakAssignment: 5
> +PenaltyBreakBeforeFirstCallParameter: 5
> +PenaltyBreakComment: 5
>  PenaltyBreakFirstLessLess: 0
> -PenaltyBreakString: 10
> -PenaltyExcessCharacter: 100
> -PenaltyReturnTypeOnItsOwnLine: 60
> +PenaltyBreakOpenParenthesis: 300

How does this interact with PenaltyBreakBeforeFirstCallParameter? Does
one override the other?

> +PenaltyBreakString: 5
> +PenaltyExcessCharacter: 10
> +PenaltyReturnTypeOnItsOwnLine: 300
>
>  # Don't sort #include's
>  SortIncludes: false
> --
> 2.47.0
>





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux