Hi, On Mon, 20 Apr 2020, Junio C Hamano wrote: > "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > > > diff --git a/reftable/update.sh b/reftable/update.sh > > new file mode 100644 > > Shouldn't this be executable, even though the readme tells the > reader to run "sh update.sh"? Do we really want to have such a shell script? To be honest, I'd rather do a subtree merge. In my mind, this should make it easier to maintain this code elsewhere, yet debug and fix it in git.git (because there is no doubt whatsoever in my mind that the vast majority of bug fixes in this code will come in via git.git, which _does_ make me wonder whether it makes sense to keep the reftable library separate, as if it was truly an independent part of Git: it is absolutely not). Of course, this would make it even more desirable to do away with code that reads like Java (and is probably a silent, defiant protest against Git's requirement to have no declarations after statements), e.g. int block_iter_next(struct block_iter *it, struct record rec) { if (it->next_off >= it->br->block_len) { return 1; } { struct slice in = { .buf = it->br->block.data + it->next_off, .len = it->br->block_len - it->next_off, }; struct slice start = in; struct slice key = { 0 }; byte extra; int n = decode_key(&key, &extra, it->last_key, in); if (n < 0) { return -1; } [...] } } The fact that this code uses a coding style that is very different from Git's, on purpose, makes it quite awkward to even so much as debug, say, the segmentation fault that was caused by my rewrite of the test script (which _also_ looks _very_ different from the existing test cases, adding to the awkwardness): test_expect_success 'basic operation of reftable storage' ' rm -rf .git && git init --ref-storage=reftable && mv .git/hooks .git/hooks-disabled && test_commit file && test_write_lines HEAD refs/heads/master refs/tags/file >expect && git show-ref && git show-ref | cut -f2 -d" " > actual && test_cmp actual expect && for count in $(test_seq 1 10) do test_commit "number $count" file.t $count number-$count || return 1 done && git gc && ls -1 .git/reftable >table-files && test_line_count = 2 table-files && git reflog refs/heads/master >output && test_line_count = 11 output && grep "commit (initial): first post" output && grep "commit: number 10" output ' The fact that this rather trivial usage can still cause a segmentation fault makes me believe that this rationale for the purposefully-different coding style implies a fundamental fallacy (see https://github.com/git-for-windows/git/commit/00eb1fd4496ef763f840878fb5249507ae83af43#commitcomment-38614650): Each time you use an unitialized variable, you put another mental burden on the reader to check that it is actually OK in that place, and the Git source code (which puts the declaration and use) far apart makes this harder to verify. The major challenge with writing C code these days, especially when you come from a background of Java or other languages that take care of memory management "for you" is that you _do_ have to manage memory in C, explicitly. And if you already start not paying attention to the initialization of local variables, you go down the path where all of a sudden you end up with a segmentation fault and have no idea where it is coming from, and the most likely explanation is that the memory was managed incorrectly. So I would recommend to actually not get sloppy with confusing declaration and initialization and initializing unnecessarily and trying to get cute with introducing code blocks "just to introduce a narrower scope for local variables". Instead, I would suggest to stick with the style Git developed; We have gotten relatively good with keeping our memory management straight that way. Ciao, Dscho