Hi, this patch series addresses a couple of issues with compaction of the reftable stack. The original intent was to make compaction handle the case gracefully where a subset of the tables it wants to compact has been locked already. While working on that I found an issue where compaction may race with a concurrent writer that modified the "tables.list" file while compacting it. In theory, this situation should work alright and has been accounted for in the design: the compacting process locks "tables.list", then locks all of the tables it wants to compact, unlocks "tables.list", compacts the data and re-locks "tables.list" to update the stack. No concurrent process would thus be able to compact the same tables, and thus we know that the tables we have just compacted must still exist. What the code didn't do though is to reload the stack before writing the updated tables to "tables.list". This could either lead to us dropping new tables appended to the stack while we were compacting, or it could lead to us writing references to concurrently-compacted tables which have been removed meanwhile, leading to data loss. The fix itself is rather straight-forward. What I'm missing though is a way to test for this given that the logic only triggers when racing with another thread. I didn't have any idea how to do this, so if anybody else has an idea: please, share it! Otherwise, I'm not sure I feel comfortable with untested medium-complexity code. The alternative would be to just bail out when we see that the stack has been compacted, which is less ideal when we have just done a bunch of working compacting large tables. Thanks! Patrick Patrick Steinhardt (8): reftable/stack: refactor function to gather table sizes reftable/stack: test compaction with already-locked tables reftable/stack: update stats on failed full compaction reftable/stack: simplify tracking of table locks reftable/stack: do not die when fsyncing lock file files reftable/stack: use lock_file when adding table to "tables.list" reftable/stack: fix corruption on concurrent compaction reftable/stack: handle locked tables during auto-compaction reftable/stack.c | 228 +++++++++++++++++++++++++++++-------- reftable/stack_test.c | 104 +++++++++++++++++ t/t0610-reftable-basics.sh | 21 ++-- 3 files changed, 300 insertions(+), 53 deletions(-) base-commit: 39bf06adf96da25b87c9aa7d35a32ef3683eb4a4 -- 2.46.0.dirty
Attachment:
signature.asc
Description: PGP signature