There are two ways a 'struct repository' can be initialized before
use: via initialize_the_repository() and via repo_init().

The first way always initializes the 'index' field because that is how
things were before the introduction of 'struct repository'. Back then
'the_index' was always available (even if not loaded).

The second way, however, leaves 'index' NULL and relies on
repo_read_index() to allocate it on demand.

The problem with the second way is that the majority of our code base
was written with 'the_index' (i.e. the first way) in mind, where
dereferencing 'the_index' (or the 'index' field now) is always safe.
The second way breaks this assumption: the 'index' field can be NULL
until it is loaded from disk, which can lead to segfaults like the one
fixed in 581d2fd9f2 (get_oid: handle NULL repo->index, 2019-05-14).

We have two options to handle this: either we audit the entire code
base, adding 'is index NULL' checks where needed, or we make sure
'index' is never NULL to begin with.

This patch goes with the second option, making sure that 'index' is
always allocated after initialization. It is less effort than the
first option, and also safer, because an audit could still miss some
call sites. The extra allocation cost is not a real concern.

The 'index' field is still freed and reset to NULL in repo_clear().
But after that call a lot more is missing in 'repo', and it can never
be used again without going through reinitialization. So it should be
fine.

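The invariant described above can be illustrated with a minimal,
standalone sketch. It does not use git's real types: 'struct toy_repo',
'struct toy_index' and all toy_*() functions are hypothetical stand-ins,
and plain calloc()/abort() stand in for xcalloc() and BUG().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for illustration only; NOT git's real definitions. */
struct toy_index {
	unsigned int entry_count;
	int loaded;
};

struct toy_repo {
	struct toy_index *index;	/* never NULL between init and clear */
};

/* Allocate 'index' up front, mirroring what the patch does at init time. */
static void toy_repo_init(struct toy_repo *repo)
{
	memset(repo, 0, sizeof(*repo));
	repo->index = calloc(1, sizeof(*repo->index));
	if (!repo->index) {
		fprintf(stderr, "out of memory\n");
		abort();	/* assumption: die on failure, like xcalloc() */
	}
}

/* Populate the index; a NULL 'index' here is a programming error. */
static int toy_repo_read_index(struct toy_repo *repo)
{
	if (!repo->index) {
		fprintf(stderr, "BUG: the repo hasn't been set up\n");
		abort();
	}
	repo->index->entry_count = 3;	/* pretend three entries were read */
	repo->index->loaded = 1;
	return (int)repo->index->entry_count;
}

/* Callers may dereference 'index' even before anything was loaded. */
static unsigned int toy_repo_entry_count(const struct toy_repo *repo)
{
	return repo->index->entry_count;
}

/* Release everything; the repo must be reinitialized before reuse. */
static void toy_repo_clear(struct toy_repo *repo)
{
	free(repo->index);
	repo->index = NULL;
}
```

In this sketch a caller can ask for the entry count right after
toy_repo_init() and gets 0 from the zeroed index instead of crashing,
which is the property the patch is after.
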
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
---
 repository.c | 3 ++-
 repository.h | 4 ++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index 682c239fe3..ca58692504 100644
--- a/repository.c
+++ b/repository.c
@@ -160,6 +160,7 @@ int repo_init(struct repository *repo,
 	struct repository_format format = REPOSITORY_FORMAT_INIT;
 	memset(repo, 0, sizeof(*repo));
 
+	repo->index = xcalloc(1, sizeof(*repo->index));
 	repo->objects = raw_object_store_new();
 	repo->parsed_objects = parsed_object_pool_new();
 
@@ -262,7 +263,7 @@ void repo_clear(struct repository *repo)
 int repo_read_index(struct repository *repo)
 {
 	if (!repo->index)
-		repo->index = xcalloc(1, sizeof(*repo->index));
+		BUG("the repo hasn't been setup");
 
 	return read_index_from(repo->index, repo->index_file, repo->gitdir);
 }
diff --git a/repository.h b/repository.h
index 4fb6a5885f..75c4f68b22 100644
--- a/repository.h
+++ b/repository.h
@@ -85,6 +85,7 @@ struct repository {
 
 	/*
 	 * Repository's in-memory index.
+	 * Cannot be NULL after initialization.
 	 * 'repo_read_index()' can be used to populate 'index'.
 	 */
 	struct index_state *index;
@@ -132,6 +133,9 @@ struct submodule;
 int repo_submodule_init(struct repository *subrepo,
 			struct repository *superproject,
 			const struct submodule *sub);
+/*
+ * Release all resources in 'repo'. 'repo' cannot be used again.
+ */
 void repo_clear(struct repository *repo);
 
 /*
-- 
2.22.0.rc0.322.g2b0371e29a