On Sun Nov 24, 2024 at 03:25, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> We have, via the attributes subsystem, a way to choose from a set of
> predefined whitespace rules so that "git diff" can notice that you
> are adding trailing whitespaces to your newly written lines, or you
> are indenting a newly introduced line in a Python script with a HT.
> This can be used, for example, in pre-commit hook to reject an
> attempt to introduce whitespace-damaging changes to the codebase.
>
> Which is great.
>
> I am wondering what we can do to add a different kind of checks to
> help file types with fixed format by extending the same mechanism,
> or the checks I have in mind are too different from the whitespace
> checks and shoehorning it into the existing mechanism does not make
> sense.  The particular check I have an immediate need for is for a
> filetype with lines, each has exactly 4 fields separated with HT in
> between, so the check would ask "does each line have exactly 3 HT on
> it?"  It would be extended to verify CSV files with fixed number of
> fields (but the validator needs to be aware of the quoting rules for
> comma in a value in fields).
>
> I guess the best I could do (outside Git) is
>
>  - write such a validator that can take one line of input and say
>    "this line conforms to the rule".
>
>  - add, via .gitattributes, my own attribute to allow me to mark
>    the files that these rules apply.  Git does not do anything
>    special for this attribute (remember, I said "outside Git").
>
>  - in pre-commit hook, run "git diff ':(attr:myattr)'" to grab
>    changes in these files with special formats, and have the
>    line-by-line validator (above) check the new lines.
>
> to make sure bad lines would not slip into the history, but it would
> be really nice if I can trigger the check as part of "git diff --check",
> which means it would be more ideal if we can do this "inside" Git.
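
For illustration, the "outside Git" steps quoted above could be sketched
roughly as below. This is only a sketch under stated assumptions, not
anything Git ships: the attribute name "myattr" is the placeholder from
the quoted message, the helper names (line_conforms, bad_added_lines,
main) are made up here, and the use of --cached, --no-color,
--no-ext-diff and -U0 is my own guess at what a pre-commit hook would
want.

```python
import subprocess
import sys

EXPECTED_TABS = 3  # four HT-separated fields means three HT per line


def line_conforms(line: str) -> bool:
    """Hypothetical validator: True if the line has exactly 3 HT."""
    return line.rstrip("\n").count("\t") == EXPECTED_TABS


def bad_added_lines(diff_text: str) -> list[str]:
    """Return the content of '+' lines in a unified diff that fail the check."""
    bad = []
    in_hunk = False
    for line in diff_text.split("\n"):
        if line.startswith("diff --git "):
            in_hunk = False  # file header ('---'/'+++' lines) follows
        elif line.startswith("@@"):
            in_hunk = True   # hunk body follows
        elif in_hunk and line.startswith("+"):
            content = line[1:]
            if not line_conforms(content):
                bad.append(content)
    return bad


def main() -> int:
    # Assumption: files of interest are marked with a "myattr" attribute
    # in .gitattributes, and the hook inspects staged changes.
    diff = subprocess.run(
        ["git", "diff", "--cached", "--no-color", "--no-ext-diff", "-U0",
         "--", ":(attr:myattr)"],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    ).stdout
    bad = bad_added_lines(diff)
    for line in bad:
        print(f"malformed line: {line!r}", file=sys.stderr)
    return 1 if bad else 0
```

A pre-commit hook would then call sys.exit(main()), so that a non-zero
status rejects the commit. A CSV variant would need a validator that
understands the quoting rules, which this sketch does not attempt.
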
>
> Perhaps we could introduce a mechanism that allows me to do the
> following:
>
>  - An attribute, like whitespace=..., specifies what line-validation
>    function to use to vet each new line introduced to a file with
>    the attribute.
>
>  - A line-validation function can be dynamically loaded/linked
>    (here, we'd need ".gitattributes specifies the logical meaning,
>    while .git/config and friends maps the 'logical meaning' to a
>    specific implementation suitable for the platform" separation,
>    similar to what we use for smudge/clean filters).  Perhaps this
>    would be a good testbed for use of dll, written even in a foreign
>    language like Rust?
>
>  - In the diff machinery, where a '+' line is checked for whitespace
>    anomalies in the existing code, add code to call the dynamically
>    loaded line-validation function when applicable.
>
>  - Profit?
>
> Hmm?

This might be a tangent, but since enhancing whitespace checking was
mentioned, I thought I'd note here: `git log --check` running in the CI
did not catch the whitespace errors in this patch (see the last hunk):

https://lore.kernel.org/git/20241121225757.3877852-4-bence@xxxxxxxxxxxxxx/

although it certainly would have been nice. I'm not sure whether --check
could already catch this, or whether it would be easy or even possible to
have something general enough that does.

Best,
Bence