Re: [PATCH 4/4] fetch-pack: print and use dangling .gitmodules

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Sun, Jan 24 2021, Jonathan Tan wrote:
> >  --fsck-objects::
> > -	Die if the pack contains broken objects. For internal use only.
> > +	For internal use only.
> > ++
> > +Die if the pack contains broken objects. If the pack contains a tree
> > +pointing to a .gitmodules blob that does not exist, prints the hash of
> > +that blob (for the caller to check) after the hash that goes into the
> > +name of the pack/idx file (see "Notes").
> 
> [I should have waited a bit and sent one E-Mail]
> 
> Is this really generally usable as an IPC mechanism, what if we need
> another set of OIDs we care about? Shouldn't it at least be hidden
> behind some option so you don't get a deluge of output from index-pack
> if you're not in this packfile-uri mode?

--fsck-objects is only for internal use, and it's only used by
fetch-pack.c. So its only consumer does want the output.

Junio also mentioned the possibility of another set of OIDs, and I
replied [1].

[1] https://lore.kernel.org/git/20210128003536.3874866-1-jonathantanmy@xxxxxxxxxx/

> But, along with my other E-Mail...
> 
> > [...]
> > +static void parse_gitmodules_oids(int fd, struct oidset *gitmodules_oids)
> > +{
> > +	int len = the_hash_algo->hexsz + 1; /* hash + NL */
> > +
> > +	do {
> > +		char hex_hash[GIT_MAX_HEXSZ + 1];
> > +		int read_len = read_in_full(fd, hex_hash, len);
> > +		struct object_id oid;
> > +		const char *end;
> > +
> > +		if (!read_len)
> > +			return;
> > +		if (read_len != len)
> > +			die("invalid length read %d", read_len);
> > +		if (parse_oid_hex(hex_hash, &oid, &end) || *end != '\n')
> > +			die("invalid hash");
> > +		oidset_insert(gitmodules_oids, &oid);
> > +	} while (1);
> > +}
> > +
> 
> Doesn't this IPC mechanism already exist in the form of fsck.skipList?
> See my 1f3299fda9 (fsck: make fsck_config() re-usable, 2021-01-05) on
> "next". I.e. as noted in my just-sent-E-Mail you could probably just
> re-use skiplist as-is.

I'm not sure how fsck.skipList could be used here. Before running
fsck_finish() for the first time, we don't know which .gitmodules are
missing and which are not. And when running fsck_finish() for the second
time, we definitely do not want to skip any blobs.

> Or if not it seems to me that this whole IPC mechanism would be better
> done with a tempfile and passing it along like we already pass the
> fsck.skipList between these processes.
> 
> I doubt it's going to be large enough to matter, we could just put it in
> .git/ somewhere, like we put gc.log etc (but created with a mktemp()
> name...).
> 
> Or if we want to keep the "print <list> | process" model we can refactor
> the existing fsck IPC noted in 1f3299fda9 a bit, so e.g. you pass some
> version of "lines prefixed with "fsck-skiplist: " go into list xyz via a
> command-line option. And then existing option(s) and your potential new
> list (which as noted, I think is probably redundant to the skiplist) can
> use it.

I think using stdout is superior to using a tempfile - we don't have to
worry about interrupted invocations, for example.

What do you mean by "the existing fsck IPC noted in 1f3299fda9"? If you
mean the ability to pass a list of OIDs, for example using "-c
fsck.skipList=filename.txt", I'm not sure that it solves anything.
Firstly, I don't think that the skipList is useful here (as I said
earlier). And secondly, I don't think that OID input is the issue -
right now, the design is a process (index-pack, calling fsck_finish())
writing to its output which is then picked up by the calling process
(fetch-pack). We are not sending the dangling .gitmodules through stdin
anywhere.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux