Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King <peff@xxxxxxxx> wrote:
> On Fri, Aug 05, 2016 at 08:26:30AM +0000, Eric Wong wrote:
> 
> > > I'm not sure which mallocs you mean. I allocate one struct per node,
> > > which seems like a requirement for a linked list. If you mean holding an
> > > extra list struct around an existing pointer (rather than shoving the
> > > prev/next pointers into the pointed-to- item), then yes, we could do
> > > that. But it feels like a bit dirty, since the point of the list is
> > > explicitly to provide an alternate ordering over an existing set of
> > > items.
> > 
> > This pattern to avoid that one malloc-per-node using list_entry
> > (container_of) is actually a common idiom in the Linux kernel
> > and Userspace RCU (URCU).  Fwiw, I find it less error-prone and
> > easier-to-follow than the "void *"-first-element thing we do
> > with hashmap.
> 
> My big problem with it is that it gets confusing when a particular
> struct is a member of several lists. We have had bugs in git where
> a struct ended up being used in two different lists, but accidentally
> using the same "next" pointer.

It might actually be easier since you would rarely (if ever)
touch the "next"/"prev" fields in your code.  This encourages
users to give meaningful names to list_head fields.

> So you need one "list_head" for each list that your struct may be a part
> of. Sometimes that's simple, but it's awkward when the code which wants
> the list is different than the code which "owns" the struct. Besides
> leaking concerns across modules, the struct may not want to pay the
> memory price for storing pointers for all of the possible lists it could
> be a member of.

Yes, the key is this list is flexible enough to be used either way:

	/* there are millions of these structs in practice */
	struct common_struct {
		struct list_head hot_ent;
		...
	};

	/* and only a handful of these */
	struct rarely_used_wrapper {
		struct list_head cold_ent;
		struct common_struct *common;
		...
	};

> For instance, I think it would be a mistake to convert the current
> commit_list code to something like this.

Of course, often a doubly-linked list is not needed or the extra
pointer is too expensive.  Linux/URCU have hlist for this reason.

I'm no expert on git internals, either; but there can be
readability improvements, too.  For example, I find http-walker.c
is easier-to-follow after 94e99012fc7a
("http-walker: reduce O(n) ops with doubly-linked list")
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]