Changes since v1: - Reworded patch one to remove misleading sentence about the locking during step 2. - In patch two, let add_delta_base_cache() take full ownership of the base, properly releasing it when the insertion is skipped. Matheus Tavares (2): packfile: fix race condition on unpack_entry() packfile: fix memory leak in add_delta_base_cache() packfile.c | 48 +++++++++++++++++++++++++++++------------------- 1 file changed, 29 insertions(+), 19 deletions(-) Range-diff against v1: 1: 42a7948f94 ! 1: 948d07673f packfile: fix race condition on unpack_entry() @@ Commit message objects. To fix the race condition (and later segmentation fault), let's reorder - the aforementioned steps so that the lock is not released between 1. - and 3. An alternative solution which would not require the reordering - would be to duplicate `base` before inserting it in the cache. - However, as Phil Hord mentioned, memcpy()'ing large bases can negatively - affect performance: in his experiments, this alternative approach slowed - git-grep down by 10% to 20%. + the aforementioned steps so that `base` is only added to the cache at + the end. This will prevent the buffer from being released by another + thread while it is still in use. An alternative solution which would not + require the reordering would be to duplicate `base` before inserting it + in the cache. However, as Phil Hord mentioned, memcpy()'ing large bases + can negatively affect performance: in his experiments, this alternative + approach slowed git-grep down by 10% to 20%. Reported-by: Phil Hord <phil.hord@xxxxxxxxx> Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx> 2: 5b6e3019e0 ! 2: f15f0c82fd packfile: fix memory leak in add_delta_base_cache() @@ Commit message When add_delta_base_cache() is called with a base that is already in the cache, no operation is performed. But the check is done after allocating space for a new entry, so we end up leaking memory on the early return. - Also, the caller always expect that the base will be inserted, so it - never free()'s it. To fix both of these memory leaks, let's move the + In addition, the caller never free()'s the base as it expects the + function to take ownership of it. But the base is not released when we + skip insertion, so it also gets leaked. To fix these problems, move the allocation of a new entry further down in add_delta_base_cache(), and - make the function return an integer to indicate whether the insertion - was performed or not. Then, make the caller free() the base when needed. + free() the base on early return. Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx> ## packfile.c ## @@ packfile.c: void clear_delta_base_cache(void) - } - } - --static void add_delta_base_cache(struct packed_git *p, off_t base_offset, -+static int add_delta_base_cache(struct packed_git *p, off_t base_offset, + static void add_delta_base_cache(struct packed_git *p, off_t base_offset, void *base, unsigned long base_size, enum object_type type) { - struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent)); @@ packfile.c: void clear_delta_base_cache(void) /* @@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_offset, + * is unpacking the same object, in unpack_entry() (since its phases I * and III might run concurrently across multiple threads). */ - if (in_delta_base_cache(p, base_offset)) -- return; -+ return 0; +- if (in_delta_base_cache(p, base_offset)) ++ if (in_delta_base_cache(p, base_offset)) { ++ free(base); + return; ++ } delta_base_cached += base_size; @@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_of ent->key.p = p; ent->key.base_offset = base_offset; ent->type = type; -@@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_offset, - hashmap_init(&delta_base_cache, delta_base_cache_hash_cmp, NULL, 0); - hashmap_entry_init(&ent->ent, pack_entry_hash(p, base_offset)); - hashmap_add(&delta_base_cache, &ent->ent); -+ return 1; - } - - int packed_object_info(struct repository *r, struct packed_git *p, -@@ packfile.c: void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset, - * thread could free() it (e.g. to make space for another entry) - * before we are done using it. - */ -- if (!external_base) -- add_delta_base_cache(p, base_obj_offset, base, base_size, type); -+ if (!external_base && !add_delta_base_cache(p, base_obj_offset, -+ base, base_size, type)) { -+ free(base); -+ } - - free(delta_data); - free(external_base); -- 2.28.0