At GitLab we have recently received a report where a repository was left with a corrupted `packed-refs` file after the node hard-crashed even though `core.fsync=reference` was set. This is something that in theory should not happen if we correctly did the atomic-rename dance to: 1. Write the data into a temporary file. 2. Synchronize the temporary file to disk. 3. Rename the temporary file into place. So if we crash in the middle of writing the `packed-refs` file we should only ever see either the old or the new state of the file. And while we do the dance when writing the `packed-refs` file, there is indeed one gotcha: we use a `FILE *` stream to write the temporary file, but don't flush it before synchronizing it to disk. As a consequence any data that is still buffered will not get synchronized and a crash of the machine may cause corruption. Fix this bug by flushing the file stream before we fsync. Signed-off-by: Patrick Steinhardt <ps@xxxxxx> --- refs/packed-backend.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/refs/packed-backend.c b/refs/packed-backend.c index c1c71d183e..6f5a0709fb 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -1263,7 +1263,8 @@ static int write_with_updates(struct packed_ref_store *refs, goto error; } - if (fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) || + if (fflush(out) || + fsync_component(FSYNC_COMPONENT_REFERENCE, get_tempfile_fd(refs->tempfile)) || close_tempfile_gently(refs->tempfile)) { strbuf_addf(err, "error closing file %s: %s", get_tempfile_path(refs->tempfile), -- 2.39.0
Attachment:
signature.asc
Description: PGP signature