2018-03-09 2:19 GMT+09:00 René Scharfe <l.s.r@xxxxxx>: > Am 08.03.2018 um 13:06 schrieb Takuto Ikuta: >> +static int add_loose_objects_to_set(const struct object_id *oid, >> + const char *path, >> + void *data) >> +{ >> + struct oidset* set = (struct oidset*)(data); > > This cast is not needed (unlike in C++). And the asterisk should be stuck > to the variable, not the type (see Documentation/CodingGuidelines). > >> + oidset_insert(set, oid); > > In fact, you could just put "data" in here instead of "set" (without a > cast), with no loss in readability or safety. > Thank you for review, changed to use data directly in v2. >> + return 0; >> +} >> + >> static int everything_local(struct fetch_pack_args *args, >> struct ref **refs, >> struct ref **sought, int nr_sought) >> @@ -719,16 +728,21 @@ static int everything_local(struct fetch_pack_args *args, >> int retval; >> int old_save_commit_buffer = save_commit_buffer; >> timestamp_t cutoff = 0; >> + struct oidset loose_oid_set = OIDSET_INIT; >> + >> + for_each_loose_object(add_loose_objects_to_set, &loose_oid_set, 0); >> >> save_commit_buffer = 0; >> >> for (ref = *refs; ref; ref = ref->next) { >> struct object *o; >> + unsigned int flag = OBJECT_INFO_QUICK; >> >> - if (!has_object_file_with_flags(&ref->old_oid, >> - OBJECT_INFO_QUICK)) >> - continue; >> + if (!oidset_contains(&loose_oid_set, &ref->old_oid)) >> + flag |= OBJECT_INFO_SKIP_LOOSE; >> >> + if (!has_object_file_with_flags(&ref->old_oid, flag)) >> + continue; >> o = parse_object(&ref->old_oid); >> if (!o) >> continue; >> @@ -744,6 +758,8 @@ static int everything_local(struct fetch_pack_args *args, >> } >> } >> >> + oidset_clear(&loose_oid_set); >> + > > This part looks fine to me. (Except perhaps call the variable "flags" > because you sometimes have two?) > Changed flag names. > Why not include packed objects as well? Probably because packs have > indexes which can queried quickly to determine object existence, and > because there are only few loose objects in typical repositories, > right? > Correct. In my target repository, chromium, fetch-pack's slowness comes from many lstat to non-existing loose objects for remote refs. I focus on to remove such lstat. > A similar cache was introduced by cc817ca3ef (sha1_name: cache > readdir(3) results in find_short_object_filename()) to speed up > finding unambiguous shorts object hashes. I wonder if it could be > used here as well, but I don't see an easy way. > >> if (!args->no_dependents) { >> if (!args->deepen) { >> for_each_ref(mark_complete_oid, NULL); >> diff --git a/sha1_file.c b/sha1_file.c >> index 1b94f39c4..c903cbcec 100644 >> --- a/sha1_file.c >> +++ b/sha1_file.c >> @@ -1262,6 +1262,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, >> if (find_pack_entry(real, &e)) >> break; >> >> + if (flags & OBJECT_INFO_SKIP_LOOSE) >> + return -1; >> + >> /* Most likely it's a loose object. */ >> if (!sha1_loose_object_info(real, oi, flags)) >> return 0; >> > > This early return doesn't just skip checking loose objects. It > also skips reloading packs and fetching missing objects for > partial clones. That may not be a problem for fetch-pack, but > it means the flag has a misleading name. Do you get the same > performance improvement if you make it only skip that > sha1_loose_object_info() call? > I changed the flag name following Junio's suggestion. If we skip sha1_loose_object_info call, reprepare_packed_git(...) runs for every non-existing refs and git fetch becomes slower. Takuto