Re: [PATCH v7 4/8] regex.3: Improve REG_STARTEND

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi!

On 4/21/23 04:16, наб wrote:
> Hi!
> 
> On Fri, Apr 21, 2023 at 03:42:48AM +0200, Alejandro Colomar wrote:
>> On 4/21/23 02:39, наб wrote:
>>> Explicitly spell out the ranges involved. The original wording always
>>> confused me, but it's actually very sane.
>>>
>>> Remove "this doesn't change R_NOTBOL & R_NEWLINE" ‒ so does it change
>>> R_NOTEOL? No. That's weird and confusing.
>>>
>>> String largeness doesn't matter, known-lengthness does.
>>>
>>> Explicitly spell out the influence on returned matches
>>> (relative to string, not start of range).
>>>
>>> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@xxxxxxxxxxxxxxxxxx>
>>
>> Patch applied.
>>
>>> ---
>>> Range-diff against v6:
>>> 1:  4b7971a5e < -:  --------- regex.3: Desoupify regfree() description
>>> 2:  5fb4cc16f ! 1:  ed050649b regex.3: Improve REG_STARTEND
>>>     @@ man3/regex.3: .SS Matching
>>>      -and ending before byte
>>>      -.IR pmatch[0].rm_eo .
>>>      +Match
>>>     -+.RI [ string " + " pmatch[0].rm_so ", " string " + " pmatch[0].rm_eo )
>>>     ++.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
>>>      +instead of
>>>     -+.RI [ string ", " string " + \fBstrlen\fP(" string )).
>>>     ++.RI [ string , " string + strlen(string)" ).
>>>       This allows matching embedded NUL bytes
>>>       and avoids a
>>>       .BR strlen (3)
>>>     @@ man3/regex.3: .SS Matching
>>>      +as usual, and the match offsets remain relative to
>>>      +.IR string
>>>      +(not
>>>     -+.IR string " + " pmatch[0].rm_so ).
>>>     ++.IR "string + pmatch[0].rm_so" ).
>>>       This flag is a BSD extension, not present in POSIX.
>>>       .SS Match offsets
>>>       Unless
>>>
>>>  man3/regex.3 | 29 ++++++++++++++++-------------
>>>  1 file changed, 16 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/man3/regex.3 b/man3/regex.3
>>> index 46a4a12b9..099c2c17f 100644
>>> --- a/man3/regex.3
>>> +++ b/man3/regex.3
>>> @@ -131,23 +131,26 @@ .SS Matching
>>>  above).
>>>  .TP
>>>  .B REG_STARTEND
>>> -Use
>>> -.I pmatch[0]
>>> -on the input string, starting at byte
>>> -.I pmatch[0].rm_so
>>> -and ending before byte
>>> -.IR pmatch[0].rm_eo .
>>> +Match
>>> +.RI [ "string + pmatch[0].rm_so" , " string + pmatch[0].rm_eo" )
>>> +instead of
>>> +.RI [ string , " string + strlen(string)" ).
>>>  This allows matching embedded NUL bytes
>>>  and avoids a
>>>  .BR strlen (3)
>>> -on large strings.
>>> -It does not use
>>> +on known-length strings.
>>> +If any matches are returned
>>> +.RB ( REG_NOSUB
>>> +wasn't passed to
>>> +.BR regcomp (),
>>> +the match succeeded, and
>>>  .I nmatch
>>> -on input, and does not change
>>> -.B REG_NOTBOL
>>> -or
>>> -.B REG_NEWLINE
>>> -processing.
>>> +> 0), they overwrite
>>> +.I pmatch
>>> +as usual, and the match offsets remain relative to
>>> +.IR string
>>
>> Minor glitch: s/IR/I/
>>
>> I fixed it.  BTW, don't know if you knew, but you can run some linters
>> to check these accidents by yourself.
> 
> 
> $ make check
> # ...
> GREP    .tmp/man/man1/memusage.1.check-catman.touch
> .tmp/man/man1/memusage.1.cat.grep:132:           Memory usage summary: heap total: 45200, heap peak: 6440, stack peak: 224
> .tmp/man/man1/memusage.1.cat.grep:135:           realloc|        40         44800             0  (nomove:40, dec:19, free:0)
> make: *** [share/mk/check/catman.mk:36: .tmp/man/man1/memusage.1.check-catman.touch] Error 1

That means the line goes beyond the 80-column margin in rendered pages.
There are pages where code examples go beyond that limit, and I can
only live with it :(.  Ideally, that test should pass in every page,
but in some cases it's impossible.

I know the name of the test is horrible.  Feel free to suggest
alternatives.  Maybe something like 'CHECK (80-col)	$@' would do.

> 
> 
> $ make lint
> SED     .tmp/man/man2/add_key.2.d/add_key.c
> LINT (checkpatch)       .tmp/man/man2/add_key.2.d/add_key.lint-c.checkpatch.touch
> bash: line 1: checkpatch: command not found
> make: *** [share/mk/lint/c.mk:64: .tmp/man/man2/add_key.2.d/add_key.lint-c.checkpatch.touch] Error 127
> 
> git grep checkpatch first says I want checkpatch(1).
> No such manual exists, at least in Debian.

Nope; that manual page probably only exists in my servers :)

<http://www.alejandro-colomar.es/src/alx/linux/checkpatch.git/>

> Then it reveals I actually want checkpatch.pl from a linux checkout.
> Probably call it [scripts/]checkpatch.pl then?

The thing is I suggested (privately; I hate that I can't
reference to some list archive) the checkpatch.pl maintainers
separating checkpatch.pl to a standalone project that can be
packaged separately, and has a separate git history.  That
way it would be directly useful to many other projects that
follow coding styles similar to the kernel.

I prepared some proof of concept in that repo, but we agreed
that it would be better if the entire git history from the
Linux git history was kept, so I need to learn how to extract
a few files from a git repo with their history (I know how to
do that for a single file or directory, but cherry-picking
files is more complex, and I didn't yet look deep into it).

So I need to do that work before trying to host that repo in
<kernel.org>.

Feel free to check out that repo, but keep in mind that I
will rewrite the entire history when I learn how to do it.

> 
> Then it reveals
>   CHECKPATCH              := checkpatch

For me it's in

$ which checkpatch
/usr/local/bin/checkpatch

And it's a modified version to be nicer to non-kernel projects.

> which means that just
>   export CHECKPATCH=~/store/code/linux/scripts/checkpatch.pl
> doesn't work, and I need to pass it as an argument (should be ?=).
> The same for all the other linters.

Yeah; feel free to send patches :)

> 
> $ make -j25 CHECKPATCH=~/store/code/linux/scripts/checkpatch.pl lint
> # ...
> LINT (mandoc)   .tmp/man/man1/pldd.1.lint-man.mandoc.touch
> mandoc: man1/getent.1:6:14: WARNING: cannot parse date, using it verbatim: (date)
> # (same what feels like every page; bullseye mandoc 1.14.5-1)

If you only want to run $CHECKPATCH, you can run
`make lint-c-checkpatch`.  For a complete set of targets, see
`make help`.  (I know; I should have told you before, but that
way I learnt some stuff that might have passed inadvertently.)

> 
> If I pass MANDOC=~/code/voreutils/mandoc (recent(ish, it was recent last
> year) CVS, + some patches I forgot that fixed some egregious formatting
> errors):
> LINT (mandoc)   .tmp/man/man5/ftpusers.5.lint-man.mandoc.touch
> LINT (mandoc)   .tmp/man/man5/gai.conf.5.lint-man.mandoc.touch
> LINT (mandoc)   .tmp/man/man5/group.5.lint-man.mandoc.touch
> LINT (mandoc)   .tmp/man/man5/host.conf.5.lint-man.mandoc.touch
> mandoc: man5/erofs.5:78:2: ERROR: skipping end of block that is not open: RE
> mandoc: man5/erofs.5:79:2: WARNING: skipping paragraph macro: IP empty
> mandoc: man5/erofs.5:78:2: WARNING: skipping paragraph macro: br at the end of SS

I see the same errors; feel free to send patches :)

$ make lint check -t
$ touch man5/erofs.5 
$ make lint check -k
LINT (mandoc)	.tmp/man/man5/erofs.5.lint-man.mandoc.touch
mandoc: man5/erofs.5:78:2: ERROR: skipping end of block that is not open: RE
mandoc: man5/erofs.5:79:2: WARNING: skipping paragraph macro: IP empty
mandoc: man5/erofs.5:78:2: WARNING: skipping paragraph macro: br at the end of SS
make: *** [share/mk/lint/man.mk:33: .tmp/man/man5/erofs.5.lint-man.mandoc.touch] Error 1
LINT (tbl comment)	.tmp/man/man5/erofs.5.lint-man.tbl.touch
make: Target 'lint' not remade because of errors.
PRECONV	.tmp/man/man5/erofs.5.tbl
TBL	.tmp/man/man5/erofs.5.eqn
EQN	.tmp/man/man5/erofs.5.cat.troff
TROFF	.tmp/man/man5/erofs.5.cat.grotty
an.tmac:man5/erofs.5:18: style: use of deprecated macro: .PD
an.tmac:man5/erofs.5:24: style: use of deprecated macro: .PD
an.tmac:man5/erofs.5:50: style: .BR expects at least 2 arguments, got 1
an.tmac:man5/erofs.5:78: style: unbalanced .RE
found style problems; aborting
make: *** [share/mk/build/catman.mk:80: .tmp/man/man5/erofs.5.cat.grotty] Error 1
make: *** Deleting file '.tmp/man/man5/erofs.5.cat.grotty'
make: Target 'check' not remade because of errors.


> 
> And it passes!

Do you mean that make doesn't recognize the error?

> Those are the only errors I saw, even on the version with
> IR\ string$
> 
> When I ran with 2>&1 | less to make sure, I got 
> /etc/bash.bashrc: line 7: PS1: unbound variable

So it seems.

> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> /etc/bash.bashrc: line 7: PS1: unbound variable
> SED     .tmp/man/man2/add_key.2.d/add_key.c
> SED     .tmp/man/man2/bind.2.d/bind.c
> SED     .tmp/man/man2/chown.2.d/chown.c
> SED     .tmp/man/man2/clock_getres.2.d/clock_getres.c
> SED     .tmp/man/man2/clone.2.d/clone.c
> SED     .tmp/man/man2/close_range.2.d/close_range.c
> SED     .tmp/man/man2/copy_file_range.2.d/copy_file_range.c
> SED     .tmp/man/man2/eventfd.2.d/eventfd.c
> and indeed
> Makefile:SHELL := /usr/bin/env bash -Eeuo pipefail
> and
> $ sed -n 6,7p /etc/bash.bashrc
> # If not running interactively, don't do anything
> [ -z "$PS1" ] && return

I have the same bashrc (Debian Sid here), and have this same
line.  Why is it failing only for you?  Maybe I modified
something in my startup scripts?  Maybe you did?

> 
> (That should be ${PS1-}. What's even funnier is that

Should we call debbugs?  :)

>  $ sed -n 14p /etc/bash.bashrc
>  if [ -z "${debian_chroot:-}" ] && [ -r /etc/debian_chroot ]; then)

Huh!

> 
> 
> $ make -j25 CHECKPATCH=~/store/code/linux/scripts/checkpatch.pl lint MANDOC=: CLANG-TIDY=:
> LINT (checkpatch)       .tmp/man/man3/_Generic.3.d/_Generic.lint-c.checkpatch.touch
> ERROR:ASSIGN_IN_IF: do not use assignment in if condition
> #17: FILE: .tmp/man/man3const/EXIT_SUCCESS.3const.d/EXIT_SUCCESS.c:17:
> +    if ((fp = fopen(argv[1], "r")) == NULL) {
> 
> Do not use assignments in if condition.
> Example::
> 
>   if ((foo = bar(...)) < BAZ) {
> 
> should be written as::
> 
>   foo = bar(...);
>   if (foo < BAZ) {
> 
> total: 1 errors, 0 warnings, 0 checks, 29 lines checked
> make: *** [share/mk/lint/c.mk:64: .tmp/man/man3const/EXIT_SUCCESS.3const.d/EXIT_SUCCESS.lint-c.checkpatch.touch] Error 1
> make: *** Waiting for unfinished jobs....

Hmmm, yes, I see that same error; this page is recent, so I
probably never run the linters on it yet.  :/

Thanks for the catch!  Fixed.

> CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
> #17: FILE: .tmp/man/man3/dl_iterate_phdr.3.d/dl_iterate_phdr.c:17:
> +    printf("Name: \"%s\" (%d segments)\n", info->dlpi_name,
> +               info->dlpi_phnum);

This page has so many warnings, that I probably missed these
valid ones.  Alignment seems performed by a schoolchild that
can't follow lines while painting :p.  Fixed.

> 
> CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
> #33: FILE: .tmp/man/man3/dl_iterate_phdr.3.d/dl_iterate_phdr.c:33:
> +        printf("    %2zu: [%14p; memsz:%7jx] flags: %#jx; ", j,
> +                (void *) (info->dlpi_addr + info->dlpi_phdr[j].p_vaddr),
> 
> total: 0 errors, 0 warnings, 2 checks, 54 lines checked
> make: *** [share/mk/lint/c.mk:63: .tmp/man/man3/dl_iterate_phdr.3.d/dl_iterate_phdr.lint-c.checkpatch.touch] Error 1
> WARNING:EMBEDDED_FUNCTION_NAME: Prefer using '"%s...", __func__' to using 'closeSocketPair', this function's name, in a string
> #230: FILE: .tmp/man/man2/seccomp_unotify.2.d/seccomp_unotify.c:230:
> +        err(EXIT_FAILURE, "closeSocketPair-close-0");
> 
> Embedded function names are less appropriate to use as
> refactoring can cause function renaming.  Prefer the use of
> "%s", __func__ to embedded function names.
> 
> Note that this does not work with -f (--file) checkpatch option
> as it depends on patch context providing the function name.
> 
> WARNING:EMBEDDED_FUNCTION_NAME: Prefer using '"%s...", __func__' to using 'closeSocketPair', this function's name, in a string
> #232: FILE: .tmp/man/man2/seccomp_unotify.2.d/seccomp_unotify.c:232:
> +        err(EXIT_FAILURE, "closeSocketPair-close-1");

I've seen this one, and thought of fixing it, but I'm not
yet sure how to do it so that the page is consistent with
itself.  So far I've not done anything.

> 
> total: 0 errors, 2 warnings, 0 checks, 612 lines checked
> make: *** [share/mk/lint/c.mk:63: .tmp/man/man2/seccomp_unotify.2.d/seccomp_unotify.lint-c.checkpatch.touch] Error 1
> 
> (more pages)
> 
> 
> I'm not sure I agree with the ASSIGN_IN_IF case,

I do agree with it; it's just that I don't run these often;
especially some linters that have many warnings in current
pages, I tend to ignore them.  But they're still useful
sometimes.

> but I'm assuming
> there's a mechanism to kill the lints you don't are about;
> linux cdc9718d5e590d6905361800b938b93f2b66818e.

I disable the lints in the Makefile, so whatever you see is
probably because it's a wanted warning, or because the
linter recently added it.  However, fixing all pages would
be impossible :(.

> 
> 
> This continues until I've disabled every linter.
> I'm assuming you have specific versions that work for you,
> but, well.

No; I do see a lot of noise too.  The thing is it's still
useful for linting specific pages:

    $ make lint check -t >/dev/null  # ignore everything
    $ make lint check -W man5/erofs.5  # lint only that page

Cheers,
Alex

> 
> 
> Best,
> наб

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

Attachment: OpenPGP_signature
Description: OpenPGP digital signature


[Index of Archives]     [Kernel Documentation]     [Netdev]     [Linux Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux