Thanks Peff for the suggestion. I ended up scripting something via JGit [1], as we're anyway using it as part of our Gradle build system. PS: As a future idea, it might be good if "git mv" gives a hint about updating .gitattributes if files matching .gitattributes pattern are moved. [1]: https://github.com/oss-review-toolkit/ort/pull/7195/commits/e01945d41012db2d0bc2e53d7be4abd513888ba6 -- Sebastian Schuberth On Sat, Jun 24, 2023 at 3:12 AM Jeff King <peff@xxxxxxxx> wrote: > > On Fri, Jun 23, 2023 at 05:29:42PM +0200, Sebastian Schuberth wrote: > > > is there a command to easily check patterns in .gitignore and > > .gitattributes to still match something? I'd like to remove / correct > > patterns that don't match anything anymore due to (re)moved files. > > I don't think there's a solution that matches "easily", but you can do a > bit with some scripting. See below. > > For checking .gitignore, I don't think you can ever say (at the git > level) that a certain pattern is useless, because it is inherently about > matching things that not tracked, and hence generated elsewhere. So if > you have a "*.foo" pattern, you can check if it matches anything > _currently_ in your working tree, but if it doesn't that may mean that > you simply did not trigger the build rule that makes the garbage ".foo" > file. > > So with that caveat, we can ask Git which rules _do_ have a match, and > then eliminate them as "definitely useful", and print the others. The > logic is sufficiently tricky that I turned to perl: > > -- >8 show-unmatched-ignore.pl 8< -- > #!/usr/bin/perl > > # The general idea here is to read "filename:linenr ..." output from > # "check-ignore -v". For each filename we learn about, we'll load the > # complete set of lines into an array and then "cross them off" as > # check-ignore tells us they were used. > # > # Note that we'd fail to mention an ignore file which matches nothing. > # Probably the list of filenames could be generated independently. I'll > # that as an exercise for the reader. > while (<>) { > /^(.*?):(\d+):/ > or die "puzzling input: $_"; > if (!defined $files{$1}) { > $files{$1} = do { > open(my $fh, '<', $1) > or die "unable to open $1: $!"; > [<$fh>] > }; > } > $files{$1}->[$2] = undef; > } > > # With that done, whatever is left is unmatched. Print them. > for my $fn (sort keys(%files)) { > my $lines = $files{$fn}; > for my $nr (1..@$lines) { > my $line = $lines->[$nr-1]; > print "$fn:$nr $line" if defined $line; > } > } > -- >8 -- > > And you'd use it something like: > > git ls-files -o | > git check-ignore --stdin -v | > perl show-unmatched-ignore.pl > > Pretty clunky, but it works OK in git.git (and shows that there are many > "not matched but probably still useful" entries; e.g., "*.dll" will > never match for me on Linux, but is probably something we still want to > keep). So I wouldn't use it as an automated tool, but it might give a > starting point for a human looking to clean things up manually. > > For attributes, I think the situation is better; we only need them to > match tracked files (though technically speaking, you may want to keep > attributes around for historical files as we use the checked-out > attributes during "git log", etc). Unfortunately we don't have an > equivalent of "-v" for check-attr. It might be possible to add that ,but > in the meantime, the best I could come up with is to munge each pattern > to add a sentinel attribute, and see if it matches anything. > > Something like: > > # Maybe also pipe in .git/info/attributes and core.attributesFile > # if you want to check those. > git ls-files '.gitattributes' '**/.gitattributes' | > while read fn; do > lines=$(wc -l <"$fn") > mv "$fn" "$fn.orig" > nr=1 > while test $nr -le $lines; do > sed "${nr}s/$/ is-matched/" <"$fn.orig" >"$fn" > git ls-files | git check-attr --stdin is-matched | > grep -q "is-matched: set" || > echo "$fn:$nr $(sed -n ${nr}p "$fn.orig")" > nr=$((nr+1)) > done > mv "$fn.orig" "$fn" > done > > It produces no output in git.git (we are using all of our attributes), > but you can add a useless one like: > > echo '*.c -diff' >>Documentation/.gitattributes > > and then the loop yields: > > Documentation/.gitattributes:2 *.c -diff > > So I definitely wouldn't call any of that "easy", but it may help you. > > -Peff