Jeff King <peff@xxxxxxxx> writes:

> On Fri, Oct 05, 2018 at 04:20:27PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> I.e. something to generate the .gitattributes file using this format:
>>
>> https://git-scm.com/docs/gitattributes#_packing_objects
>>
>> Some stuff is obvious, like "*.gpg binary -delta", but I'm wondering if
>> there's some repo scanner utility to spew this out for a given repo.
>
> I'm not sure what you mean by "un-delta-able" objects. Do you mean ones
> where we're not likely to find a delta? Or ones where Git will not try
> to look for a delta?
>
> If the latter, I think the only rules are the "-delta" attribute and the
> object size. You should be able to use git-check-attr and "git-cat-file"
> to get that info.
>
> If the former, I don't know how you would know. We can only report on
> what isn't a delta _yet_.

I am reasonably sure that the question is about solving the former so
that the "-delta" attribute is set appropriately.

Initially, I thought that an undeltifiable object is likely to have
higher randomness than deltifiable ones, and that this could be
exploited. But suppose you have such a highly random blob A (and no
other object like it) in the repository, and then later acquire
another blob B that happens to share most of its data with A. Then A
and B will each pass the "highly random" test by themselves, yet each
can still be expressed as a delta derived from the other.

So your "what isn't a delta yet" is a reasonable assessment of what
can be known mechanically. Knowledge/heuristics like "No two '*.gpg'
files are expected to be alike" need something more than the
randomness of individual files, I guess.
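
To make the A/B argument concrete, here is a rough Python sketch. It is not anything Git itself does: the byte-histogram entropy is just one possible "randomness" test, and a zlib preset dictionary is used only as a stand-in for Git's delta search. The helper names (entropy_bits_per_byte, compressed_size) and the blob sizes are made up for illustration.

```python
import math
import os
import zlib
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def compressed_size(data: bytes, base: bytes = b"") -> int:
    """Deflated size of data, optionally primed with base as a preset dictionary.

    Assumption: the preset dictionary only approximates "B expressed
    relative to A"; it is not Git's delta algorithm.
    """
    if base:
        comp = zlib.compressobj(level=9, zdict=base)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(data) + comp.flush())


# Blob A: 16 KiB of random bytes, standing in for something like a *.gpg file.
blob_a = os.urandom(16 * 1024)
# Blob B: shares most of its data with A; only a 64-byte region differs.
blob_b = blob_a[:4096] + os.urandom(64) + blob_a[4160:]

entropy_a = entropy_bits_per_byte(blob_a)
entropy_b = entropy_bits_per_byte(blob_b)
alone = compressed_size(blob_b)
against_a = compressed_size(blob_b, base=blob_a)

print(f"entropy A: {entropy_a:.3f} bits/byte")
print(f"entropy B: {entropy_b:.3f} bits/byte")
print(f"B alone: {alone} bytes, B relative to A: {against_a} bytes")
```

Both blobs look close to maximally random when examined one at a time, yet B shrinks to a small fraction of its size once A is available as a base. A per-file randomness scan would therefore mark both as "-delta" candidates and get it wrong, which is why the heuristic has to come from knowledge about the file type rather than from the individual blobs.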