On Tue, Mar 08, 2016 at 04:08:21PM +0100, Ævar Arnfjörð Bjarmason wrote:

> What I really want is something for git-log more like
> git-for-each-ref, so I could emit the following info for each file
> being modified delimited by some binary marker:
>
>     - file name before
>     - file name after
>     - is rename?
>     - is binary?
>     - size in bytes before
>     - size in bytes after
>     - removed lines
>     - added lines

If you get the full sha1s of each object (e.g., by adding --raw), then
you can dump them all to a single cat-file invocation to efficiently get
the sizes.

I'm not quite sure I understand why you want to know about renames and
added/removed lines if you are just blocking binary files. If I were
implementing this[1], I'd probably just block based on blob size, which
you can do with:

  git rev-list --objects $old..$new |
  git cat-file --batch-check='%(objectsize) %(objectname) %(rest)' |
  perl -alne 'print if $F[0] > 1_000_000; # or whatever' |
  while read size sha1 file; do
    echo "Whoops, $file ($sha1) is too big"
    exit 1
  done

You can also use %(objectsize:disk) to get the on-disk size (which can
tell you about things that don't compress well, which tend to be the
sorts of things you are trying to keep out).

You can't ask about binary-ness, but I don't think it would be
unreasonable for cat-file to have a "would git consider this content
binary?" placeholder for --batch-check.

The other things are properties of the comparison, not of individual
objects, so you'll have to get them from "git log". But with some clever
scripting, I think you could feed those sha1s (or $commit:$path
specifiers) into a single cat-file invocation to get the before/after
sizes.

-Peff

[1] GitHub has hard and soft limits for various blob sizes, and at one
    point the implementation looked very similar to what I showed here.
    The downside is that for a large push, the rev-list can actually
    take a fair bit of time (e.g., consider pushing up all of the kernel
    history to a brand new repo), and this is on top of the similar work
    already done by index-pack and check_everything_connected().

    These days I have a hacky patch to notice the too-big size directly
    in index-pack, which is essentially free. It doesn't know about the
    file path, so we pull that out later in the pre-receive hook. But we
    only have to do so in the uncommon case that there _is_ actually a
    too-big file, so normal pushes incur no penalty.
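
As a rough illustration of the "clever scripting" idea above (feeding before/after sha1s into a single cat-file invocation), here is a minimal sketch. It is not taken from the message: the function name sizes_before_after is hypothetical, and the choice of "git diff-tree -r --raw --no-abbrev" to obtain the sha1 pairs is an assumption (the message itself only mentions "git log" with --raw). It assumes paths without tabs or characters that git would quote, does no rename detection, and does not special-case submodule entries.

```shell
# Hypothetical helper (not from the original message): print one line per
# blob side touched between two commits, as "<size> <before|after> <path>",
# using a single "git cat-file --batch-check" process.
sizes_before_after () {
	old=$1
	new=$2

	# Raw diff lines look like:
	#   :<oldmode> <newmode> <oldsha1> <newsha1> <status>TAB<path>
	git diff-tree -r --raw --no-abbrev "$old" "$new" |
	awk -F'\t' '
		{
			split($1, meta, " ")
			# An all-zero sha1 means the file does not exist on that
			# side (added or deleted), so there is nothing to size.
			if (meta[3] !~ /^0+$/) print meta[3], "before", $2
			if (meta[4] !~ /^0+$/) print meta[4], "after", $2
		}' |
	# Everything after the first whitespace on each input line comes
	# back verbatim via %(rest), so the side and path ride along.
	git cat-file --batch-check='%(objectsize) %(rest)'
}

# Example use inside a hook, with $old and $new as in the snippet above:
#   sizes_before_after "$old" "$new"
```

Added files produce only an "after" line and deleted files only a "before" line. The output can be piped through the same kind of perl or "while read" filter as the blob-size check earlier in the message.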