My second week blog is finished. The web version is here: https://adlternative.github.io/GSOC-Git-Blog-2/

-------

## Week2: learning the slang of a new city

### What happened this week

- In [[PATCH 1/2] [GSOC] ref-filter: add %(raw) atom](https://lore.kernel.org/git/b3848f24f2d3f91fc96f20b5a08cbfbe721acbd6.1622126603.git.gitgitgadget@xxxxxxxxx/), I made a license-related mistake this week. When I was implementing the `%(raw)` atom for ref-filter, I noticed that `glibc` does not provide `memcasecmp()`, which compares two pieces of memory while ignoring case. So I found the `memcasecmp()` implemented by `gnulib` on the Internet and copied it into Git. Unfortunately, I should not have copied it so "conveniently": Git uses `gpl-v2` and `gnulib` uses `gpl-v3`, and they are incompatible. Because I have mostly written code for my own use, I was not very sensitive to license problems. Thanks to `Felipe Contreras` for correcting me. From today onwards, I will be more careful about licenses.

- On the other hand, I realized that clean code is also very important. In `cmp_ref_sorting()`, I want to use `memcmp()/memcasecmp()` to compare two strings.

  BAD VERSION:

  ```c
  int (*cmp_fn)(const void *, const void *, size_t);
  cmp_fn = s->sort_flags & REF_SORTING_ICASE
  	? memcasecmp : memcmp;

  if (va->s_size != ATOM_VALUE_S_SIZE_INIT &&
      vb->s_size != ATOM_VALUE_S_SIZE_INIT) {
  	cmp = cmp_fn(va->s, vb->s, va->s_size > vb->s_size ?
  		     vb->s_size : va->s_size);
  } else if (va->s_size == ATOM_VALUE_S_SIZE_INIT) {
  	slen = strlen(va->s);
  	cmp = cmp_fn(va->s, vb->s, slen > vb->s_size ?
  		     vb->s_size : slen);
  } else {
  	slen = strlen(vb->s);
  	cmp = cmp_fn(va->s, vb->s, slen > va->s_size ?
  		     slen : va->s_size);
  }
  cmp = cmp ? cmp : va->s_size - vb->s_size;
  }
  ```

  It's complicated and buggy.

  GOOD VERSION:

  ```c
  int (*cmp_fn)(const void *, const void *, size_t);
  cmp_fn = s->sort_flags & REF_SORTING_ICASE
  	? memcasecmp : memcmp;
  size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
  	strlen(va->s) : va->s_size;
  size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
  	strlen(vb->s) : vb->s_size;

  cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
  	     a_size : b_size);
  if (!cmp) {
  	if (a_size > b_size)
  		cmp = 1;
  	else if (a_size < b_size)
  		cmp = -1;
  }
  ```

  It's a lot cleaner: each length is resolved once, and the lengths only break a tie when the common prefix compares equal.

So how does one cultivate a good coding style? As `Felipe Contreras` said: "It's like learning the slang of a new city; it takes a while."

### What's the next step

There are still some flaws in the `%(raw)` implementation, but let's look ahead and see what we can do with it in `git cat-file --batch`. We check the atoms with `verify_ref_format()`, then pass the object oid and grab the corresponding object data through `format_ref_array_item()`:

|Git command|Atoms|
|-|-|
|`git cat-file --batch-check` | `%(objectname) %(objecttype) %(objectsize)`|
|`git cat-file --batch --follow-symlinks`| `%(objectname) %(objecttype) %(objectsize)`|
|`git cat-file --batch` | `%(objectname) %(objecttype) %(objectsize)\n%(raw)`|
|`git cat-file --batch --textconv` | `%(objectname) %(objecttype) %(objectsize)\n%(raw:textconv)`|
|`git cat-file --batch --filters` | `%(objectname) %(objecttype) %(objectsize)\n%(raw:filter)`|
|`git cat-file --batch="%(rest)"` | `%(rest)`|

No additional operation is required for `git cat-file --batch --follow-symlinks`, because `get_oid_with_context(...,GET_OID_FOLLOW_SYMLINKS,...)` already does that for us.

I have left a rough implementation here: [adlternative:cat-file-temp](https://github.com/gitgitgadget/git/compare/master...adlternative:cat-file-temp). Its performance is 25% worse than before. I am not posting it to the mailing list yet, because I still need to implement the pieces it depends on step by step.

Thanks!

--
ZheNing Hu
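
For reference, the comparison described in the GOOD VERSION above can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions: `sketch_memcasecmp()` and `sketch_cmp_sized()` are hypothetical names, the case-insensitive helper is written from scratch here (it is neither the `gnulib` code nor what Git ships), and it relies on the C locale's `tolower()`.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper written from scratch for illustration:
 * compare n bytes of a and b, ignoring ASCII case via tolower().
 * Unlike strcasecmp(), it does not stop at NUL bytes.
 */
static int sketch_memcasecmp(const void *a, const void *b, size_t n)
{
	const unsigned char *pa = a;
	const unsigned char *pb = b;
	size_t i;

	for (i = 0; i < n; i++) {
		int ca = tolower(pa[i]);
		int cb = tolower(pb[i]);

		if (ca != cb)
			return ca < cb ? -1 : 1;
	}
	return 0;
}

/*
 * Hypothetical wrapper mirroring the GOOD VERSION: compare the common
 * prefix first, then let the shorter buffer sort first on a tie.
 */
static int sketch_cmp_sized(const void *a, size_t a_size,
			    const void *b, size_t b_size, int icase)
{
	int (*cmp_fn)(const void *, const void *, size_t) =
		icase ? sketch_memcasecmp : memcmp;
	int cmp = cmp_fn(a, b, a_size < b_size ? a_size : b_size);

	if (!cmp) {
		if (a_size > b_size)
			cmp = 1;
		else if (a_size < b_size)
			cmp = -1;
	}
	return cmp;
}
```

Comparing the sizes explicitly, rather than returning `a_size - b_size`, avoids converting an unsigned `size_t` difference to `int`, which can wrap around or truncate to the wrong sign.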