On 11/21/2021 10:32 PM, Han Xin wrote:
> From: Han Xin <hanxin.hx@xxxxxxxxxxxxxxx>
>
> When streaming a large blob object to "write_loose_object()", we have no
> chance to run "write_object_file_prepare()" to calculate the oid in
> advance. So we need to handle undetermined oid in function
> "write_loose_object()".
>
> In the original implementation, we know the oid and we can write the
> temporary file in the same directory as the final object, but for an
> object with an undetermined oid, we don't know the exact directory for
> the object, so we have to save the temporary file in ".git/objects/"
> directory instead.

My first reaction is to not write into .git/objects/ directly, but
instead make a .git/objects/tmp/ directory and write within that
directory. The idea is to prevent leaving stale files in the
.git/objects/ directory if the process terminates strangely (say,
a power outage or segfault).

If this is an idea worth pursuing, it does leave a question: should
we clean up the tmp/ directory when it is empty? That would require
adding a check in finalize_object_file() that is probably best left
out (the lstat() would add a cost to every loose object write, which
is probably too expensive). I would rather leave an empty tmp/
directory than pay that cost on every loose object write.

I suppose another way to do it would be to register the check as an
event at the end of the process, so we only check once, and only if
we created a loose object with this streaming method.

With all of these complications in mind, I think cleaning up the
stale tmp/ directory could (at the very least) be delayed to another
commit or patch series. Hopefully adding the directory is not too
much complication to add here.

> -	loose_object_path(the_repository, &filename, oid);
> +	if (is_null_oid(oid)) {
> +		/* When oid is not determined, save tmp file to odb path. */
> +		strbuf_reset(&filename);
> +		strbuf_addstr(&filename, the_repository->objects->odb->path);
> +		strbuf_addch(&filename, '/');

Here, instead of the strbuf_addch(), you could do

	strbuf_add(&filename, "/tmp/", 5);
	if (safe_create_leading_directories(filename.buf)) {
		error(_("failed to create '%s'"), filename.buf);
		strbuf_release(&filename);
		return -1;
	}

> +	} else {
> +		loose_object_path(the_repository, &filename, oid);
> +	}
>
>  	fd = create_tmpfile(&tmp_file, filename.buf);
>  	if (fd < 0) {
> @@ -1939,12 +1946,31 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
>  		die(_("deflateEnd on object %s failed (%d)"), oid_to_hex(oid),
>  		    ret);
>  	the_hash_algo->final_oid_fn(&parano_oid, &c);
> -	if (!oideq(oid, &parano_oid))
> +	if (!is_null_oid(oid) && !oideq(oid, &parano_oid))
>  		die(_("confused by unstable object source data for %s"),
>  		    oid_to_hex(oid));
>
>  	close_loose_object(fd);
>
> +	if (is_null_oid(oid)) {
> +		int dirlen;
> +
> +		oidcpy((struct object_id *)oid, &parano_oid);
> +		loose_object_path(the_repository, &filename, oid);
> +
> +		/* We finally know the object path, and create the missing dir. */
> +		dirlen = directory_size(filename.buf);
> +		if (dirlen) {
> +			struct strbuf dir = STRBUF_INIT;
> +			strbuf_add(&dir, filename.buf, dirlen - 1);
> +			if (mkdir(dir.buf, 0777) && errno != EEXIST)
> +				return -1;
> +			if (adjust_shared_perm(dir.buf))
> +				return -1;
> +			strbuf_release(&dir);
> +		}
> +	}
> +

Upon first reading I was asking "where is the file rename?" but it is
part of finalize_object_file(), which is called further down.

Thanks,
-Stolee
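
P.S. To make the tmp/ directory idea and the "check once at the end of
the process" idea a bit more concrete, here is a rough standalone sketch
in plain POSIX C. It does not use Git's internal API: the helper names
open_streaming_tmpfile(), finalize_streamed_file() and
remove_tmp_dir_if_empty() are hypothetical, and the fixed 4096-byte path
buffers are an assumption made only to keep the example short.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helpers for illustration; none of these exist in Git. */

static char tmp_dir_to_clean[4096];

/* Runs once at exit. rmdir() simply fails if the directory is not empty. */
static void remove_tmp_dir_if_empty(void)
{
	if (tmp_dir_to_clean[0])
		rmdir(tmp_dir_to_clean);
}

/*
 * Create "<objdir>/tmp" if needed and open a unique temporary file in it.
 * On success, returns the file descriptor and stores the file name in
 * "path". Returns -1 on failure.
 */
static int open_streaming_tmpfile(const char *objdir, char *path, size_t len)
{
	static int cleanup_registered;
	char dir[sizeof(tmp_dir_to_clean)];
	int n, fd;

	n = snprintf(dir, sizeof(dir), "%s/tmp", objdir);
	if (n < 0 || (size_t)n >= sizeof(dir))
		return -1;
	if (mkdir(dir, 0777) && errno != EEXIST)
		return -1;

	n = snprintf(path, len, "%s/tmp_obj_XXXXXX", dir);
	if (n < 0 || (size_t)n >= len)
		return -1;
	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	/* Register the cleanup only once, and only if streaming was used. */
	if (!cleanup_registered) {
		strcpy(tmp_dir_to_clean, dir);
		if (!atexit(remove_tmp_dir_if_empty))
			cleanup_registered = 1;
	}
	return fd;
}

/*
 * Once the final name is known, move the file into place. Both paths
 * live under the same object directory, so rename() stays on one
 * filesystem.
 */
static int finalize_streamed_file(const char *tmp_path, const char *final_path)
{
	return rename(tmp_path, final_path);
}
```

A caller would open the file with open_streaming_tmpfile(), stream the
data into the descriptor, close it, and then call
finalize_streamed_file() with the final object path. Because the
cleanup uses rmdir(), a tmp/ directory that still holds a stale file
from some other crashed process is left alone.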