Am 05.04.21 um 12:04 schrieb Phillip Wood: > Hi Atharva > > On 30/03/2021 11:22, Atharva Raykar wrote: >> >> >>> On 30-Mar-2021, at 12:34, Atharva Raykar <raykar.ath@xxxxxxxxx> wrote: >>> >>> >>> >>>> On 29-Mar-2021, at 15:48, Phillip Wood <phillip.wood123@xxxxxxxxx> >>>> wrote: >>>> >>>> Hi Atharva >>>> >>>> On 28/03/2021 13:23, Atharva Raykar wrote: >>>>> On 28-Mar-2021, at 05:16, Johannes Sixt <j6t@xxxxxxxx> wrote: >>>>> [...] >>>>>>> diff --git a/t/t4018/scheme-local-define >>>>>>> b/t/t4018/scheme-local-define >>>>>>> new file mode 100644 >>>>>>> index 0000000000..90e75dcce8 >>>>>>> --- /dev/null >>>>>>> +++ b/t/t4018/scheme-local-define >>>>>>> @@ -0,0 +1,4 @@ >>>>>>> +(define (higher-order) >>>>>>> + (define local-function RIGHT >>>>>> >>>>>> ... this one, which is also indented and *is* marked as RIGHT. >>>>> In this test case, I was explicitly testing for an indented '(define' >>>>> whereas in the former, I was testing for the top-level >>>>> '(define-syntax', >>>>> which happened to have an internal define (which will inevitably >>>>> show up >>>>> in a lot of scheme code). >>>> >>>> It would be nice to include indented define forms but including them >>>> means that any change to the body of a function is attributed to the >>>> last internal definition rather than the actual function. For example >>>> >>>> (define (f arg) >>>> (define (g x) >>>> (+ 1 x)) >>>> >>>> (some-func ...) >>>> ;;any change here will have '(define (g x)' in the hunk header, not >>>> '(define (f arg)' >>> >>> The reason I went for this over the top level forms, is because >>> I felt it was useful to see the nearest definition for internal >>> functions that often have a lot of the actual business logic of >>> the program (at least a lot of SICP seems to follow this pattern). >>> The disadvantage is as you said, it might also catch trivial inner >>> functions and the developer might lose context. >> >> Never mind this message, I had misunderstood the problem you were >> trying to >> demonstrate. I wholeheartedly agree with what you are trying to say, and >> the indentation heuristic discussed does look interesting. I shall have a >> glance at the RFC you linked in the other reply. >> >>> The disadvantage is as you said, it might also catch trivial inner >>> functions and the developer might lose context. >> >> Feel free to disregard me misquoting you here. You did not say that (: >> >>> Another problem is it may match more trivial bindings, like: >>> >>> (define (some-func things) >>> ... >>> (define items '(eggs >>> ham >>> peanut-butter)) >>> ...) >>> >>> What I have noticed *anecdotally* is that this is not common enough >>> to be too much of a problem, and local define bindings seem to be more >>> favoured in Racket than other Schemes, that use 'let' more often. >>> >>>> I don't think this can be avoided as we rely on regexs rather than >>>> parsing the source so it is probably best to only match toplevel >>>> defines. >>> >>> The other issue with only matching top level defines is that a >>> lot of scheme programs are library definitions, something like >>> >>> (library >>> (foo bar) >>> (export ...) >>> (define ...) >>> (define ...) >>> ;; and a bunch of other definitions... >>> ) >>> >>> Only matching top level defines will completely ignore matching all >>> the definitions in these files. >> >> That said, I still stand by the fact that only catching top level defines >> will lead to a lot of definitions being ignored. Maybe the occasional >> mismatch may be worth the gain in the number of function contexts being >> detected? > > I'm not sure that the mismatches will be occasional - every time you > have an internal definition in a function the hunk header will be wrong > when you change the main body of the function. This will affect grep > --function-context and diff -W as well as the normal hunk headers. The > problem is there is no way to avoid that and provide something useful in > the library example you have above. It would be useful to find some code > bases and diff the output of 'git log --patch' with and without the > leading whitespace match in the function pattern to see how often this > is a problem (i.e. when the funcnames do not match see which one is > correct). --function-context is just one application of the function matcher. To work properly with nested function definitions, it would have to understand the nesting. But it does not; there is nothing that we can do about it without a proper language parser. Therefore, the argument that the matcher does not work well with --function-context for nested functions is of little relevance. IMO, the primary concern should be whether the matcher decorates hunk contexts sufficiently well. -- Hannes