On 05/20/2013 06:44 PM, Jeff King wrote: > On Mon, May 20, 2013 at 04:42:38PM +0200, Michael Haggerty wrote: > >>>> * Many callers store the empty string ("") as the name; for example, >>>> most of the entries created during a run of rev-list have "" as >>>> their name. This means that lots of needless copies of "" are being >>>> made. I think that the best solution to this problem would be to >>>> store NULL rather than "" for such entries, but I haven't figured >>>> out all of the places where the name is used. >>> >>> Use strbufs? >>> >>> No allocation (except for the strbuf object itself) is needed for >>> empty strings, and string ownership and be transferred to and from it >>> to prevent extra copies. >> >> That would cost two extra size_t per object_array_entry. I have the >> feeling that this structure is used often enough that the extra overhead >> would be a disadvantage, but I'm not sure. >> >> The obvious alternative would be to teach users to deal with NULL and >> either add another constructor alternative that transfers string >> ownership or *always* transfer string ownership and change the callers >> to call xstrdup() if they don't already own the name string. I think I >> will try that approach first. > > You could use the same trick that strbuf does: instead of NULL, point to > a well-known empty string literal. Readers do not have to care about > this optimization at all; only writers need to recognize the well-known > pointer value. And since we do not update in place but only eventually > free, it really is just that anyone calling free() would do "if (name != > well_known_empty_string)". Yes, that sounds like the best solution. Ultimately there is only one writer, add_object_array_with_mode(), and it can do if (!name) entry->name = NULL; else if (!*name) entry->name = well_known_empty_string; else entry->name = xstrdup(name); This should be a lot less intrusive than what I was trying: to change callers who wrote name="" to write name=NULL instead. But it is a nightmare to find all of the code that reads name and decide whether they need to do entry->name ? entry->name : "" because that in turn depends on whether the code that wrote into the same object_array always/never/sometimes wrote strings vs. NULL to the name field. Blech, encapsulation is tough in C. While I was chasing down callers, I came across other gems like builtin/checkout.c: add_pending_object(&revs, object, sha1_to_hex(object->sha1)); revision.c: add_pending_object(revs, object, sha1_to_hex(object->sha1)); submodule.c: add_pending_object(rev, &list->item->object, sha1_to_hex(list->item->object.sha1)); so apparently I wasn't the only one befuddled by the lifetime and ownership of the name field of object_array_entry. Michael -- Michael Haggerty mhagger@xxxxxxxxxxxx http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html