Re: Clean up stale .gitignore and .gitattribute patterns

Thanks Peff for the suggestion. I ended up scripting something via
JGit [1], since we already use it as part of our Gradle build.

PS: As a future idea, it might be good if "git mv" gave a hint about
updating .gitattributes when the moved files match a .gitattributes
pattern.

[1]: https://github.com/oss-review-toolkit/ort/pull/7195/commits/e01945d41012db2d0bc2e53d7be4abd513888ba6

-- 
Sebastian Schuberth

On Sat, Jun 24, 2023 at 3:12 AM Jeff King <peff@xxxxxxxx> wrote:
>
> On Fri, Jun 23, 2023 at 05:29:42PM +0200, Sebastian Schuberth wrote:
>
> > is there a command to easily check patterns in .gitignore and
> > .gitattributes to still match something? I'd like to remove / correct
> > patterns that don't match anything anymore due to (re)moved files.
>
> I don't think there's a solution that matches "easily", but you can do a
> bit with some scripting. See below.
>
> For checking .gitignore, I don't think you can ever say (at the git
> level) that a certain pattern is useless, because it is inherently about
> matching things that are not tracked, and hence generated elsewhere. So if
> you have a "*.foo" pattern, you can check if it matches anything
> _currently_ in your working tree, but if it doesn't that may mean that
> you simply did not trigger the build rule that makes the garbage ".foo"
> file.
>
> So with that caveat, we can ask Git which rules _do_ have a match, and
> then eliminate them as "definitely useful", and print the others. The
> logic is sufficiently tricky that I turned to perl:
>
> -- >8 show-unmatched-ignore.pl 8< --
> #!/usr/bin/perl
>
> # The general idea here is to read "filename:linenr ..." output from
> # "check-ignore -v". For each filename we learn about, we'll load the
> # complete set of lines into an array and then "cross them off" as
> # check-ignore tells us they were used.
> #
> # Note that we'd fail to mention an ignore file which matches nothing.
> # Probably the list of filenames could be generated independently. I'll
> # leave that as an exercise for the reader.
> while (<>) {
>   /^(.*?):(\d+):/
>     or die "puzzling input: $_";
>   if (!defined $files{$1}) {
>     $files{$1} = do {
>       open(my $fh, '<', $1)
>         or die "unable to open $1: $!";
>       [<$fh>]
>     };
>   }
>   # check-ignore reports 1-based line numbers; the array is 0-based.
>   $files{$1}->[$2 - 1] = undef;
> }
>
> # With that done, whatever is left is unmatched. Print them.
> for my $fn (sort keys(%files)) {
>   my $lines = $files{$fn};
>   for my $nr (1..@$lines) {
>     my $line = $lines->[$nr-1];
>     print "$fn:$nr $line" if defined $line;
>   }
> }
> -- >8 --
>
> And you'd use it something like:
>
>   git ls-files -o |
>   git check-ignore --stdin -v |
>   perl show-unmatched-ignore.pl
>
> Pretty clunky, but it works OK in git.git (and shows that there are many
> "not matched but probably still useful" entries; e.g., "*.dll" will
> never match for me on Linux, but is probably something we still want to
> keep). So I wouldn't use it as an automated tool, but it might give a
> starting point for a human looking to clean things up manually.
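
A minimal sketch of the "exercise for the reader" mentioned in the
script's comments, for anyone who prefers to stay in shell: the ignore
files to audit are passed as arguments, so a file that matches nothing
still gets reported. The name show_unmatched is made up for
illustration. It assumes that source paths in the "check-ignore -v"
output contain no colons, and that no file argument contains "=" (awk
would treat that as a variable assignment). Skipping blank and comment
lines is an addition over the Perl version.

```shell
#!/bin/sh
# show_unmatched (hypothetical helper): print ignore-file lines that no
# "git check-ignore -v" record refers to.
#   stdin: check-ignore -v output, "<source>:<linenr>:<pattern><TAB><path>"
#   args:  the ignore files to audit
show_unmatched() {
  awk '
    pass == 1 {
      # Remember "<source>:<linenr>" for every pattern that matched.
      split($0, f, ":")
      seen[f[1] ":" f[2]] = 1
      next
    }
    # Pass 2: blank lines and comments are never patterns, so skip them.
    /^[ \t]*$/ || /^#/ { next }
    {
      key = FILENAME ":" FNR
      if (!(key in seen)) print key " " $0
    }
  ' pass=1 - pass=2 "$@"
}
```

Run from the top of the working tree, something like this should work
as long as the ignore file paths contain no whitespace:

  git ls-files -o |
  git check-ignore --stdin -v |
  show_unmatched $(git ls-files '.gitignore' '**/.gitignore')
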
>
> For attributes, I think the situation is better; we only need them to
> match tracked files (though technically speaking, you may want to keep
> attributes around for historical files as we use the checked-out
> attributes during "git log", etc). Unfortunately we don't have an
> equivalent of "-v" for check-attr. It might be possible to add that, but
> in the meantime, the best I could come up with is to munge each pattern
> to add a sentinel attribute, and see if it matches anything.
>
> Something like:
>
>   # Maybe also pipe in .git/info/attributes and core.attributesFile
>   # if you want to check those.
>   git ls-files '.gitattributes' '**/.gitattributes' |
>   while read fn; do
>         lines=$(wc -l <"$fn")
>         mv "$fn" "$fn.orig"
>         nr=1
>         while test $nr -le $lines; do
>                 sed "${nr}s/$/ is-matched/" <"$fn.orig" >"$fn"
>                 git ls-files | git check-attr --stdin is-matched |
>                 grep -q "is-matched: set" ||
>                 echo "$fn:$nr $(sed -n ${nr}p "$fn.orig")"
>                 nr=$((nr+1))
>         done
>         mv "$fn.orig" "$fn"
>   done
>
> It produces no output in git.git (we are using all of our attributes),
> but you can add a useless one like:
>
>   echo '*.c -diff' >>Documentation/.gitattributes
>
> and then the loop yields:
>
>   Documentation/.gitattributes:2 *.c -diff
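
As far as I can tell, the loop above also appends the sentinel to blank
and comment lines, so those would be reported as unmatched. A small
sketch of a replacement for the sed step that avoids this: mark_line (a
hypothetical name) prints the file with " is-matched" appended to line
N, and returns non-zero without marking anything when line N is blank
or a comment, so the caller can skip it. Macro definitions ("[attr]...")
are not treated specially here.

```shell
#!/bin/sh
# mark_line NR FILE (hypothetical helper): print FILE with the sentinel
# attribute appended to line NR. Exits non-zero, leaving the content
# unmodified, if line NR is blank or a comment.
mark_line() {
  awk -v nr="$1" '
    NR == nr {
      if ($0 ~ /^[ \t]*$/ || $0 ~ /^[ \t]*#/)
        skip = 1                    # not a pattern line; nothing to test
      else
        $0 = $0 " is-matched"       # same munging as the sed command
    }
    { print }
    END { exit skip }
  ' "$2"
}
```

In the loop, the sed line could then become something like:

  mark_line "$nr" "$fn.orig" >"$fn" || { nr=$((nr+1)); continue; }
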
>
> So I definitely wouldn't call any of that "easy", but it may help you.
>
> -Peff



