On Sun, Apr 03 2022, xing zhi jiang wrote:

Aside from what Johannes Sixt mentioned:

> +PATTERNS("javascript",
> + /* don't match the expression may contain parenthesis, because it is not a function declaration */
> + "!^[ \t]*(if|do|while|for|with|switch|catch|import|return)\n"
> + /* don't match statement */
> + "!;\n"
> + /* match normal function or named export for function in ECMA2015 */
> + "^((export[\t ]+)?(async[\t ]+)?function[\t ]*[\t *]*[$_[:alpha:]][$_[:alnum:]]*[\t ]*\\(.*)\n"
> + /* match JavaScript variable declaration with a lambda expression at top level */
> + "^((const|let|var)[\t ]*[$_[:alpha:]][$_[:alnum:]]*[\t ]*=[\t ]*"
> + "(\\(.*\\)|[$_[:alpha:]][$_[:alnum:]]*)[\t ]*=>[\t ]*\\{?)\n"
> + /* match object's property assignment by anonymous function and CommonJS exports for named function */
> + "^((module\\.)?[$_[:alpha:]][$_[:alnum:]]*\\.[$_[:alpha:]][$_[:alnum:]]*[\t ]*=[\t ]*(async[\t ]+)?(\\(.*\\)|[$_[:alpha:]][$_[:alnum:]]*)[\t ]*=>.*)\n"
> + /* match assign function to LHS with explicit function keyword */
> + "^(.*=[\t ]*function[\t ]*([$_[:alnum:]]+[\t ]*)?\\(.*)\n"
> + /* popular unit testing framework test case pattern. Most of framework pattern is match by regex for "function in class" */
> + "^[\t ]*(QUnit.test\\(.*)\n"
> + /* don't match the function in class or in object literal, which has more than one ident level */
> + "!^(\t{2,}|[ ]{5,})\n"
> + /* match normal function in object literal */
> + "^[\t ]*([$_[:alpha:]][$_[:alnum:]]*[\t ]*:[\t ]*function.*)\n"
> + /* don't match chained method call */
> + "!^[\t ]*[$_[:alpha:]][$_[:alnum:]][\t ]*\\(.*\\)\\.\n"
> + /* match function in class and ES5 method shorthand */
> + "^[\t ]*((static[\t ]+)?((async|get|set)[\t ]+)?[$_[:alpha:]][$_[:alnum:]]*[\t ]*\\(.*)",
> + /* word regex */
> + /* hexIntegerLiteral, octalIntegerLiteral, binaryIntegerLiteral, and its big version */
> + "0[xXoObB][_0-9a-fA-F]+n?"
> + /* DecimalLiteral and its big version*/
> + "|[0-9][_0-9]*(\\.[0-9][_0-9]*|n)?([eE][+-]?[_0-9]+)?"
> + "|\\.[0-9][_0-9]*([eE][+-]?[_0-9]+)?"
> + /* punctuations */
> + "|\\.{3}|<=|>=|==|!=|={3}|!==|\\*{2}|\\+{2}|--|<<|>>"
> + "|>>>|&&|\\|{2}|\\?{2}|\\+=|-=|\\*=|%=|\\*{2}="
> + "|<<=|>>=|>>>=|&=|\\|=|\\^=|&&=|\\|{2}=|\\?{2}=|=>"
> + /* identifiers */
> + "|[$_[:alpha:]][$_[:alnum:]]*"),
> PATTERNS("markdown",
> "^ {0,3}#{1,6}[ \t].*",
> /* -- */<

While we don't use helper macros for these currently, there's no reason
we can't. I think the above might be more readable with e.g.:

#define JS_AA "[$_[:alpha:]][$_[:alnum:]]"

Which would make this:

+PATTERNS("javascript",
+ /* don't match the expression may contain parenthesis, because it is not a function declaration */
+ "!^[ \t]*(if|do|while|for|with|switch|catch|import|return)\n"
+ /* don't match statement */
+ "!;\n"
+ /* match normal function or named export for function in ECMA2015 */
+ "^((export[\t ]+)?(async[\t ]+)?function[\t ]*[\t *]*" JS_AA "*[\t ]*\\(.*)\n"
+ /* match JavaScript variable declaration with a lambda expression at top level */
+ "^((const|let|var)[\t ]*" JS_AA "*[\t ]*=[\t ]*"
+ "(\\(.*\\)|" JS_AA "*)[\t ]*=>[\t ]*\\{?)\n"
+ /* match object's property assignment by anonymous function and CommonJS exports for named function */
+ "^((module\\.)?" JS_AA "*\\." JS_AA "*[\t ]*=[\t ]*(async[\t ]+)?(\\(.*\\)|" JS_AA "*)[\t ]*=>.*)\n"
+ /* match assign function to LHS with explicit function keyword */
+ "^(.*=[\t ]*function[\t ]*([$_[:alnum:]]+[\t ]*)?\\(.*)\n"
+ /* popular unit testing framework test case pattern. Most of framework pattern is match by regex for "function in class" */

We try to stick to wrapping at 80 characters, so some of these comments
should really be wrapped (see CodingGuidelines for the multi-line
comment style we use).

+ "^[\t ]*(QUnit.test\\(.*)\n"
+ /* don't match the function in class or in object literal, which has more than one ident level */
+ "!^(\t{2,}|[ ]{5,})\n"
+ /* match normal function in object literal */
+ "^[\t ]*(" JS_AA "*[\t ]*:[\t ]*function.*)\n"
+ /* don't match chained method call */
+ "!^[\t ]*" JS_AA "[\t ]*\\(.*\\)\\.\n"
+ /* match function in class and ES5 method shorthand */
+ "^[\t ]*((static[\t ]+)?((async|get|set)[\t ]+)?" JS_AA "*[\t ]*\\(.*)",
+ /* word regex */
+ /* hexIntegerLiteral, octalIntegerLiteral, binaryIntegerLiteral, and its big version */
+ "0[xXoObB][_0-9a-fA-F]+n?"
+ /* DecimalLiteral and its big version*/
+ "|[0-9][_0-9]*(\\.[0-9][_0-9]*|n)?([eE][+-]?[_0-9]+)?"
+ "|\\.[0-9][_0-9]*([eE][+-]?[_0-9]+)?"
+ /* punctuations */
+ "|\\.{3}|<=|>=|==|!=|={3}|!==|\\*{2}|\\+{2}|--|<<|>>"
+ "|>>>|&&|\\|{2}|\\?{2}|\\+=|-=|\\*=|%=|\\*{2}="
+ "|<<=|>>=|>>>=|&=|\\|=|\\^=|&&=|\\|{2}=|\\?{2}=|=>"
+ /* identifiers */
+ "|" JS_AA "*"),

Just a thought, I wonder how much less line-noisy we could make this
thing in general if we defined some common patterns with such helpers.

Anyway, instead of :alnum: and :alpha:, don't you really mean
[a-zA-Z0-9] and [a-zA-Z]? I.e. do you *really* want to have this
different depending on the user's locale? I haven't tested, but see the
LC_CTYPE in gettext.c, so I'm fairly sure that'll happen...
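
To make both suggestions concrete, here is a minimal standalone sketch.
It is not part of userdiff.c: JS_ID_ASCII, JS_ID_POSIX, JS_FUNCTION_RE
and line_matches() are hypothetical names made up for illustration, and
it calls POSIX regcomp() directly instead of going through git's
userdiff machinery. It relies on the compiler concatenating adjacent
string literals, which is what makes a helper macro like JS_AA usable
inside PATTERNS() in the first place. Unlike JS_AA, the macros here
include the trailing "*" so callers don't have to remember it.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper macros (not in git). The ASCII variant spells out
 * the ranges; the POSIX variant uses character classes, whose meaning
 * is defined by the LC_CTYPE category of the current locale.
 */
#define JS_ID_ASCII "[$_a-zA-Z][$_a-zA-Z0-9]*"
#define JS_ID_POSIX "[$_[:alpha:]][$_[:alnum:]]*"

/*
 * Modeled on the "normal function or named export" rule in the patch.
 * Adjacent string literals are concatenated by the compiler, so the
 * macro expands in place.
 */
#define JS_FUNCTION_RE \
	"^((export[\t ]+)?(async[\t ]+)?function[\t ]*[\t *]*" \
	JS_ID_ASCII "[\t ]*\\(.*)"

/*
 * Return 1 if `line` matches the extended regex `pattern`, 0 if it
 * does not, and -1 if the pattern fails to compile.
 */
static int line_matches(const char *pattern, const char *line)
{
	regex_t re;
	int matched;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;
	matched = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return matched;
}
```

With this, line_matches(JS_FUNCTION_RE, "export async function $load(url) {")
returns 1 and line_matches(JS_FUNCTION_RE, "if (x) {") returns 0. A
program that never calls setlocale() runs in the "C" locale, so the two
identifier macros should agree on ASCII input there; any difference
would only show up once LC_CTYPE is set from the user's environment,
which I haven't tested either.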