Denton Liu <liu.denton@xxxxxxxxx> writes: > The code to test that objects were not missing from the packfile was > duplicated many times. Extract the duplicated code into > test_no_missing_in_packs() and use that instead. > > Refactor the resulting extraction so that if any git commands fail, > their return codes are not silently lost. > > We were using sed to filter lines. Although not incorrect, this is > exactly what grep is built for. Replace this invocation of sed with grep > so that we use the correct tool for the job. Well, $ sed -n -e 's/required match/desired part of the line/p' is much much more approirate than $ grep -e "requred match" | extract desired part of the line "grep" is better only if the original were $ sed -n -e '/required match/p' but everybody would write it with grep to begin with ;-) So, I dunno about this part of the conversion. > Instead of verifying each file of `alt_objects/pack/*.idx` individually > in a for-loop, batch them together into one verification step. Do you mean this one? git verify-pack -v alt_objects/pack/*.idx where we may pass 1 or more .idx file to the command? At first my reading was interrupted by a "Huh?", but that does look good. We'd need to be a bit careful to make sure that we have at least 1 .idx file, as the shell will happily feed a file whose name is "*.idx", which verify-pack would be unhappy about. > The original testing construct was O(n^2): it used a grep in a loop to > test whether any objects were missing in the packfile. Rewrite this to > sort the files then use `comm -23` so that finding missing lines from > the original file is done more efficiently. OK. If we an show measurable speedups, it would be great, but the loop structure does look O(n^2) and unnecessary costly. > +test_no_missing_in_packs () { > + myidx=$(ls -1 .git/objects/pack/*.idx) && > + test_path_is_file "$myidx" && If there are 2 or more .idx files, or if there is none, $myidx would hopefully be a concatenation of these filenames or a string that ends with asterisk-dot-idx and would fail path_is_file. Sounds OK. Ah, I do not have to review this part---these are repeated patterns in the original. > + git verify-pack -v alt_objects/pack/*.idx >orig.raw && > + grep "^[0-9a-f]\{40\}" orig.raw | cut -d" " -f1 | sort >orig && If output from 'grep' can be used as-is, it is worth doing, but if you have to pipe it to cut, the original that used sed to filter and edit the line would probably be a better way to write it. > + git verify-pack -v $myidx >dest.raw && This part does not quote $myidx" (inherited from the original); it probably is OK, as any potentially problematic value in $myidx would have been caught as an error much earlier in this test.