Hi, Maksym I haven't read the entire patch (and I don't normally use the ident feature), but I left a few comments below: On Thu, Aug 26, 2021 at 1:28 AM Maksym Sobolyev via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > > diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt > index 83fd4e19a4..9e486f3e8d 100644 > --- a/Documentation/gitattributes.txt > +++ b/Documentation/gitattributes.txt > @@ -382,6 +382,14 @@ sign `$` upon checkout. Any byte sequence that begins with > `$Id:` and ends with `$` in the worktree file is replaced > with `$Id$` upon check-in. > > +The `ident` attribute can also provide an optional value, > +which if supplied is going to be used for expansion instead of > +the string `Id`. > + > +------------------------ > +*.[ch] ident=FreeBSD > +------------------------ What happens if there is a ':' or '$' in the custom id name? $ echo 'f ident=$weird:$' >.gitattributes $ echo '$$weird:$$' >f $ git add . $ git commit -m files $ rm f $ git checkout f $ cat f $$weird:$: 907f73b343505bf289a4a25e41ea91d90250fedb $ Seem right. Now let's see the cleaning direction: $ echo foo >> f $ git add f $ git cat-file -p :f $$weird:$: 907f73b343505bf289a4a25e41ea91d90250fedb $ foo Hmm, this one is not right. I think I know the cause: > diff --git a/convert.c b/convert.c > index 0d6fb3410ae..1e8940bf9d7 100644 > --- a/convert.c > +++ b/convert.c > @@ -1122,17 +1127,18 @@ static int ident_to_git(const char *src, size_t len, > len -= dollar + 1 - src; > src = dollar + 1; > > - if (len > 3 && !memcmp(src, "Id:", 3)) { > - dollar = memchr(src + 3, '$', len - 3); > + if (len > idact->id_len + 1 && !memcmp(src, idact->id, idact->id_len) && src[idact->id_len + 1] == ':') { Shouldn't this be `... && src[idact->id_len] == ':')`? I.e. without the "+ 1". > diff --git a/convert.h b/convert.h > index 5ee1c322058..a02632e8104 100644 > --- a/convert.h > +++ b/convert.h > @@ -76,11 +76,16 @@ enum convert_crlf_action { > > struct convert_driver; > > +struct ident_action { > + const char *id; > + int id_len; I known the string size is limited to GIT_MAX_IDENT_LEN, but perhaps it would be better to future-proof this and use `size_t` (or at least `unsigned int`). > diff --git a/parallel-checkout.c b/parallel-checkout.c > index ddc0ff3c064..f9a3f2ff25b 100644 > --- a/parallel-checkout.c > +++ b/parallel-checkout.c > @@ -403,13 +403,15 @@ static void send_one_item(int fd, struct parallel_checkout_item *pc_item) > size_t name_len = pc_item->ce->ce_namelen; > size_t working_tree_encoding_len = working_tree_encoding ? > strlen(working_tree_encoding) : 0; > + const char *ident_action_id = pc_item->ca.ident_action.id; > + size_t ident_action_len = pc_item->ca.ident_action.id_len; > > /* > * Any changes in the calculation of the message size must also be made > * in is_eligible_for_parallel_checkout(). > */ > len_data = sizeof(struct pc_item_fixed_portion) + name_len + > - working_tree_encoding_len; > + working_tree_encoding_len + ident_action_len; This update in the packet size calculation should also be reflected in is_eligible_for_parallel_checkout(): packed_item_size = sizeof(struct pc_item_fixed_portion) + ce->ce_namelen + (ca->working_tree_encoding ? strlen(ca->working_tree_encoding) : 0); > diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh > index 25254579618..822957a8dc8 100755 > --- a/t/t2082-parallel-checkout-attributes.sh > +++ b/t/t2082-parallel-checkout-attributes.sh > @@ -20,16 +20,19 @@ test_expect_success 'parallel-checkout with ident' ' > ( > cd ident && > echo "A ident" >.gitattributes && > + echo "C ident=MyCusomVeryLongAndWordyId" >>.gitattributes && > echo "\$Id\$" >A && > echo "\$Id\$" >B && > + echo "\$MyCusomVeryLongAndWordyId\$" >C && > git add -A && > git commit -m id && > > - rm A B && > + rm A B C && > test_checkout_workers 2 git reset --hard && > hexsz=$(test_oid hexsz) && > grep -E "\\\$Id: [0-9a-f]{$hexsz} \\\$" A && > - grep "\\\$Id\\\$" B > + grep "\\\$Id\\\$" B && > + grep -E "\\\$MyCusomVeryLongAndWordyId: [0-9a-f]{$hexsz} \\\$" C > ) > ' The change to this test looks good. But there are other things related to the ident filter that are not covered here (as the test is mostly interested in how ident interacts with parallel checkout). For example, the test doesn't check whether the "cleaning" direction is performed correctly, it just checks the "smudging" part. (That's why it didn't catch the error I showed earlier.) You might want to take a look at the other ident tests in t0021-conversion.sh and perhaps addapt/copy some of them to ensure that the expected behavior persists when using a custom id name. For example, quickly changing the "$Id:" references to "$customId:" and replacing "ident" with "ident=customId", I seem to get at least one test failure: --- expected-output 2021-08-26 19:40:07.181596662 +0000 +++ expanded-keywords 2021-08-26 19:40:07.188263463 +0000 @@ -6,5 +6,5 @@ $customId: bebd07c752ffffd6779e1056db5de66c3bb733ed $ $customId: bebd07c752ffffd6779e1056db5de66c3bb733ed $ $customId: NoTerminatingSymbol -$customId: Foreign Commit With Spaces $ +$customId: bebd07c752ffffd6779e1056db5de66c3bb733ed $ $customId: NoTerminatingSymbolAtEOF \ No newline at end of file error: last command exited with $?=1 not ok 3 - expanded_in_repo (But I haven't dig further, and this was just a quick test, so it could be my fault.)