Re: [GSOC][RFC] Add more builtin patterns for userdiff, as Mircroproject.

Hi Sergius,

On Tue, Jan 9, 2024 at 8:59 PM Sergius Nyah <sergiusnyah@xxxxxxxxx> wrote:
>
> Hello everyone,
> I'm Sergius, a Computer Science undergraduate student, and I want to
> begin contributing to the Git project. So far, I've gone through
> Matheus' tutorial on First steps Contributing to Git, and I found it
> very helpful. I've also read the Contribution guidelines keenly and
> built Git from source.

Thanks for your interest in contributing to Git!

> In accordance with the contributor guidelines, I came across this
> Mircoproject idea from: https://git.github.io/SoC-2022-Microprojects/

s/Mircoproject/microproject/

There is a similar typo in the subject of your email too.

> which I'm willing to work on. It talked about enhancing Git's
> "userdiff" feature in "userdiff.c" which is crucial for identifying
> function names in various programming languages, thereby improving the
> readability of "git diff" outputs.
>
> From my understanding, the project involves extending the `userdiff`
> feature to support additional programming languages that are currently
> not covered such as Shell, Swift, Go and the others.

As far as I can see, userdiff.c already has built-in patterns for
Golang and Bash, so those two seem to be supported.

> Here is a sample of how a language is defined in `userdiff.c`:
>
> > #define PATTERNS(lang, rx, wrx) { \
> > 	.name = lang, \
> > 	.binary = -1, \
> > 	.funcname = { \
> > 		.pattern = rx, \
> > 		.cflags = REG_EXTENDED, \
> > 	}, \
> > 	.word_regex = wrx "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+", \
> > 	.word_regex_multi_byte = wrx "|[^[:space:]]", \
> > }
>
> In this code, `lang` is the name of the language, `rx` is the regular
> expression for identifying function names, and `wrx` is the word
> regex.
>
> Approach: I identified the programming languages that are not
> currently supported by the userdiff feature by reviewing the existing
> patterns in userdiff.c and comparing them with some popular
> programming languages.
> For each unsupported language, I would define a regular expression
> that could help identify function names in that language. This could
> include researching each language's syntax and testing the
> expressions to ensure that they work well.

In your microproject, you only need to add support for ONE language
that is not supported yet. Please don't try to do more than that.

> Also, I'd add a new IPATTERN definition for each language to the
> "userdiff.c" file, then rebuild Git and test the changes by creating a
> repo with files in the newly supported languages, then run "git diff"
> to ensure the "@@ ... @@" hunk header lines show the correct function
> names. Then submit a patch.

Except for my comments above, this looks like a good plan. Thanks.




