Hi Atharva

On 03/04/2021 14:16, Atharva Raykar wrote:
Add a diff driver for Scheme-like languages which recognizes top level
and local `define` forms, whether it is a function definition, binding,
syntax definition or a user-defined `define-xyzzy` form.

Also supports R6RS `library` forms, `module` forms along with class and
struct declarations used in Racket (PLT Scheme).

Alternate "def" syntax such as those in Gerbil Scheme are also
supported, like defstruct, defsyntax and so on.

The rationale for picking `define` forms for the hunk headers is because
it is usually the only significant form for defining the structure of
the program, and it is a common pattern for schemers to have local
function definitions to hide their visibility, so it is not only the top
level `define`'s that are of interest. Schemers also extend the language
with macros to provide their own define forms (for example, something
like a `define-test-suite`) which is also captured in the hunk header.

Since it is common practice to extend syntax with variants of a form
like `module+`, `class*` etc, those have been supported as well.

The word regex is a best-effort attempt to conform to R6RS[1] valid
identifiers, symbols and numbers.

[1] http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-7.html#node_chap_4

Signed-off-by: Atharva Raykar <raykar.ath@xxxxxxxxx>
[...]
diff --git a/userdiff.c b/userdiff.c
index 3f81a2261c..ac1999bbc5 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -191,6 +191,10 @@ PATTERNS("rust",
 	 "[a-zA-Z_][a-zA-Z0-9_]*"
 	 "|[0-9][0-9_a-fA-Fiosuxz]*(\\.([0-9]*[eE][+-]?)?[0-9_fF]*)?"
 	 "|[-+*\\/<>%&^|=!:]=|<<=?|>>=?|&&|\\|\\||->|=>|\\.{2}=|\\.{3}|::"),
+PATTERNS("scheme",
+	 "^[\t ]*(\\(((define|def(struct|syntax|class|method|rules|record|proto|alias)?)[-*/ \t]|(library|module|struct|class)[*+ \t]).*)$",
+	 /* All words should be delimited by spaces or parentheses */
+	 "([^][)(}{[ \t])+"),

I think it would be nice to match a single '(' or '[' as a word of its
own, so that the word diff highlights when one has been added or
deleted - I find this useful when I get a syntax error. It would also
be nice to handle R7RS identifiers like | this is a symbol |. Maybe
something like

	"(\\|([^\\\\|]*(\\\\|)*)*\\||[^][}{)( \t]|[][(){}])"

Best Wishes

Phillip

 PATTERNS("bibtex", "(@[a-zA-Z]{1,}[ \t]*\\{{0,1}[ \t]*[^ \t\"@',\\#}{~%]*).*$",
 	 "[={}\"]|[^={}\" \t]+"),
 PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
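
To illustrate the point about brackets, here is a small standalone sketch (not part of the patch) that splits a line into successive matches of a word regex. It assumes POSIX <regex.h> with REG_EXTENDED, which is what PATTERNS() uses; git's own word-diff splitting may differ in details. The names split_words and bracket_word_regex are made up for this example, and bracket_word_regex is a simplified stand-in that only adds the bracket alternative. It is not the exact expression suggested above and does not handle | ... | identifiers.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Word regex from the patch, copied verbatim. */
const char patch_word_regex[] = "([^][)(}{[ \t])+";

/*
 * Hypothetical variant for illustration only: the same idea, plus an
 * alternative that matches each bracket character as its own word.
 */
const char bracket_word_regex[] = "[^][)(}{ \t]+|[][(){}]";

/*
 * Hypothetical helper (not a git function): writes the successive
 * matches of `pattern` in `line` to `out`, separated by '\n'.
 * Returns the number of words, or -1 on a regex error or if `out`
 * is too small.
 */
int split_words(const char *pattern, const char *line,
		char *out, size_t outsz)
{
	regex_t re;
	regmatch_t m;
	size_t used = 0;
	int count = 0;

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;
	if (outsz)
		out[0] = '\0';

	while (*line && !regexec(&re, line, 1, &m, 0)) {
		size_t len = (size_t)(m.rm_eo - m.rm_so);

		if (!len)
			break;	/* an empty match would never advance */
		if (used + len + 2 > outsz) {
			count = -1;	/* no room for separator, word and NUL */
			break;
		}
		if (count)
			out[used++] = '\n';
		memcpy(out + used, line + m.rm_so, len);
		used += len;
		out[used] = '\0';
		count++;
		line += m.rm_eo;	/* continue after this word */
	}

	regfree(&re);
	return count;
}
```

With the patch's regex, "(define (f x))" splits into the three words define, f and x, so adding or dropping a ')' never shows up as a changed word. With the variant, the same line gives seven words, each bracket being one of them.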