Jeff King <peff@xxxxxxxx> writes:

> This is weirdly specific. Can we accomplish the same thing with existing
> tools?
>
> E.g., could:
>
>   git cat-file --batch-all-objects --batch-check='%(objectname)' |
>   shuffle |
>   head -n 100
>
> do the same thing?
>
> I know that "shuffle" isn't available everywhere, but I'd much rather
> see us fill in portability gaps in a general way, rather than
> introducing one-shot C code that needs to be maintained (and you
> wouldn't _think_ that t/helper programs need much maintenance, but try
> perusing "git log t/helper" output; they have to adapt to the same
> tree-wide changes as the rest of the code).

I was thinking about this a bit more, and came to the conclusion
that "sort -R" and "shuf" are the wrong tools to use.  We would want
to measure with something close to a real-world workload.  For
example, letting

	git rev-list --all --objects

produce the list of objects in traversal order (i.e. this is very
similar to the order in which "git log -p" needs to access the
objects) and cutting it off at the number of sample objects you need
in your test would give you such a list.
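
A minimal sketch of the traversal-order sampling described above, assuming
a POSIX shell.  The function name "sample_objects" and the default of 100
are made up for illustration and are not from the thread.

```shell
# Hypothetical helper: print the first N object names in the order
# "git rev-list --all --objects" visits them.
sample_objects () {
	n=${1:-100}	# assumed default sample size

	# For trees and blobs, rev-list prints "<oid> <path>";
	# keep only the object name, then stop after N lines.
	git rev-list --all --objects |
	cut -d' ' -f1 |
	head -n "$n"
}
```

Run inside a repository, "sample_objects 100 >sample-oids" would leave a
list that can be fed to something like "git cat-file --batch-check".
Unlike a shuffled list, the result is deterministic for a given set of
refs.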