On Wed, Jul 31, 2024 at 08:04:17PM -0500, Justin Tobler wrote: > On 24/07/31 04:15PM, Patrick Steinhardt wrote: > > But there is a bug in the code. Suppose we have two processes which are > > compacting two slices of the table. Given that we lock each of the > > tables before compacting them, we know that the slices must be disjunct > > from each other. But regardless of that, compaction performed by one > > process will always impact what the other process needs to write to the > > "tables.list" file. > > I'm not quite sure I understand at this point how it is possible for two > compaction operations to be performed concurrently. Wouldn't there > always be overlap between the two compaction segments thus causing one > of the operations to be unable to acquire all of the required locks and > abort? In practice we cannot assume anything about how another process compacts tables. While we can assume something about how a particular version of Git compacts tables, we cannot assume anything about future versions of Git or about alternate implementations of Git. The reftable backend allows for compacting only a subset of tables, and the heuristic is not mandated by the on-disk format except that the tables that we are about to compact need to be next to each other in the stack. Furthermore, with the next patch, we also handle it gracefully when some parts of the stack are locked already. Thus, it can easily happen that process A compacts tables 1 to 3, whereas process B will try to compact tables 1 to 5, fail to acquire the lock for table 3, and then reduce the range to compact to 3 to 5. > > changed after we have locked it for the second time in (5). This has the > > consequence that we will always commit the old, cached in-core tables to > > disk without paying to respect what the other process has written. This > > scenario would then lead to data loss and corruption. > > If a concurrent compaction happens though, it would mess up the indices > and cause problems when writting the "tables.list" file. That would not > be good. Yup. > > This can even happen in the simpler case of one compacting process and > > one writing process. The newly-appended table by the writing process > > would get discarded by the compacting process because it never sees the > > new table. > > This is indeed a problem. Since we don't reload the stack, we are > unaware of any concurrently append tables causing them to not be > written in the new "tables.list" file. Scary Indeed. > > + /* > > + * We have found the new range that we want to replace, so > > + * let's update the range of tables that we want to replace. > > + */ > > + last_to_replace = last + (new_offset - first); > > + first_to_replace = new_offset; > > + } else { > > + REFTABLE_CALLOC_ARRAY(names, st->merged->stack_len + 1); > > I was confused at first by the `stack_len` + 1. The extra element is > NULL which tells us there are no more tables to add to the list, > correct? It looks like `fd_read_lines()` also adds an extra element. Yes, that's the reason why we have it. We end up passing `names` to `free_names()`, which uses `NULL` as a sentinel value to know when to stop iterating over the array's entries. I'll add a comment. Thanks for your review. I'll wait a bit longer before sending out another version of this patch series to wait for some more feedback. Patrick
Attachment:
signature.asc
Description: PGP signature