> On Sun, Jan 24 2021, Jonathan Tan wrote: > > --fsck-objects:: > > - Die if the pack contains broken objects. For internal use only. > > + For internal use only. > > ++ > > +Die if the pack contains broken objects. If the pack contains a tree > > +pointing to a .gitmodules blob that does not exist, prints the hash of > > +that blob (for the caller to check) after the hash that goes into the > > +name of the pack/idx file (see "Notes"). > > [I should have waited a bit and sent one E-Mail] > > Is this really generally usable as an IPC mechanism, what if we need > another set of OIDs we care about? Shouldn't it at least be hidden > behind some option so you don't get a deluge of output from index-pack > if you're not in this packfile-uri mode? --fsck-objects is only for internal use, and it's only used by fetch-pack.c. So its only consumer does want the output. Junio also mentioned the possibility of another set of OIDs, and I replied [1]. [1] https://lore.kernel.org/git/20210128003536.3874866-1-jonathantanmy@xxxxxxxxxx/ > But, along with my other E-Mail... > > > [...] > > +static void parse_gitmodules_oids(int fd, struct oidset *gitmodules_oids) > > +{ > > + int len = the_hash_algo->hexsz + 1; /* hash + NL */ > > + > > + do { > > + char hex_hash[GIT_MAX_HEXSZ + 1]; > > + int read_len = read_in_full(fd, hex_hash, len); > > + struct object_id oid; > > + const char *end; > > + > > + if (!read_len) > > + return; > > + if (read_len != len) > > + die("invalid length read %d", read_len); > > + if (parse_oid_hex(hex_hash, &oid, &end) || *end != '\n') > > + die("invalid hash"); > > + oidset_insert(gitmodules_oids, &oid); > > + } while (1); > > +} > > + > > Doesn't this IPC mechanism already exist in the form of fsck.skipList? > See my 1f3299fda9 (fsck: make fsck_config() re-usable, 2021-01-05) on > "next". I.e. as noted in my just-sent-E-Mail you could probably just > re-use skiplist as-is. I'm not sure how fsck.skipList could be used here. Before running fsck_finish() for the first time, we don't know which .gitmodules are missing and which are not. And when running fsck_finish() for the second time, we definitely do not want to skip any blobs. > Or if not it seems to me that this whole IPC mechanism would be better > done with a tempfile and passing it along like we already pass the > fsck.skipList between these processes. > > I doubt it's going to be large enough to matter, we could just put it in > .git/ somewhere, like we put gc.log etc (but created with a mktemp() > name...). > > Or if we want to keep the "print <list> | process" model we can refactor > the existing fsck IPC noted in 1f3299fda9 a bit, so e.g. you pass some > version of "lines prefixed with "fsck-skiplist: " go into list xyz via a > command-line option. And then existing option(s) and your potential new > list (which as noted, I think is probably redundant to the skiplist) can > use it. I think using stdout is superior to using a tempfile - we don't have to worry about interrupted invocations, for example. What do you mean by "the existing fsck IPC noted in 1f3299fda9"? If you mean the ability to pass a list of OIDs, for example using "-c fsck.skipList=filename.txt", I'm not sure that it solves anything. Firstly, I don't think that the skipList is useful here (as I said earlier). And secondly, I don't think that OID input is the issue - right now, the design is a process (index-pack, calling fsck_finish()) writing to its output which is then picked up by the calling process (fetch-pack). We are not sending the dangling .gitmodules through stdin anywhere.