On Fri, Jun 14, 2024 at 08:35:56AM -0700, Junio C Hamano wrote: > My knee-jerk reaction is that we probably can safely loosen by > ignoring non-blob .gitWHATEVER files, but security-minded folks may > be able to come up with some plausible attack scenarios if we did > so. > > Comments from those who have worked on transfer time and runtime > checks on these are highly appreciated. I can't think of an immediate problem security-wise, since we know that Git won't actually read the trees. In fact, I was a little surprised that normal Git commands outside of fsck did not run into problems with a .gitattributes directory. What happens on Linux, at least, is that we try to fopen() the directory, which works, and then read() returns EISDIR. But because we do so through strbuf_getline(), it just looks like EOF and we don't complain. Arguably that code (and all the other loops like it) should check ferror() and complain. But I'd also suspect that other non-POSIX platforms would see the error at open(), and _would_ actually produce an error message. I seem to recall running into this before with Windows, maybe? So even if we loosened fsck, I'm not sure if it's something we want to support. And of course my bigger question is: why? These are reserved names that have special meaning to Git. Sticking stuff that Git doesn't understand and may produce errors for seems odd. > Having said that, the checks for .gitmodules and .gitattributes in > fsck.c first collect objects that tree entries with these names > point at into oidsets (this all happens in fsck.c:fsck_tree()), but > the actual check for these found objects are done only when they are > blobs. Only when we encounter a blob object, these oidsets are > looked at in fsck.c:fsck_blob(), and if it is .gitmodules its > contents inspected (and may result in a warning or an error). So > the "checks" Alexey reports may not be in the runtime or transfer > time checks done in fsck but something else. I dunno. The fsck checks will kick in. When we fsck the containing tree, we learn that some oid X is a .gitmodules file, and queue that. The hope is that later we're fsck-ing X anyway, and we get to validate for free-ish. But if we _don't_ see it (either we checked the blob before the tree, or perhaps the blob is not even part of the set of objects we're checking), then at the end we validate any queued items that are left. We have to do it this way to catch the case of somebody pushing up blob X in one push (perhaps with the name "foo"), and then doing a second push that references it with a new name (i.e., renaming "foo" to ".gitmodules"). -Peff