[RFC PATCH v1 1/6] bcache: acquire c->journal.lock in bch_btree_leaf_dirty()

Coly Li <colyli@xxxxxxx> · Sat, 2 Mar 2019 21:47:28 +0800

When bch_btree_leaf_dirty() increases the bcache journal pin counter,
the current code calls atomic_inc(w->journal) directly, without holding
c->journal.lock. This is problematic because it may cause the following
code in journal.c:journal_reclaim() to work improperly:
 610 while (!atomic_read(&fifo_front(&c->journal.pin)))
 611 	fifo_pop(&c->journal.pin, p);

The above code is protected by the spinlock c->journal.lock, and the
atomic counter w->journal in btree.c:bch_btree_leaf_dirty() is one of
the nodes in c->journal.pin. If the above while() loop happens to reach
the fifo node that w->journal points to in bch_btree_leaf_dirty(), it
is possible that between lines 610 and 611 the counter w->journal is
increased, yet the node is still popped off in journal_reclaim(). The
journal jset referenced by w->journal in bch_btree_leaf_dirty() is then
lost.

If the system crashes or reboots before the bkeys of the lost jset are
flushed back to the bcache btree node, journal_replay() after the reboot
may complain that some journal entries are lost and fail to register the
cache set.

This race condition is very rare. I observed it after changing the
number of journal buckets to 3, which leaves only a limited number of
jsets available. With that change, it is possible to observe journal
replay failures due to lost jset(s).

Signed-off-by: Coly Li <colyli@xxxxxxx>
---
 drivers/md/bcache/btree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 23cb1dc7296b..ac1b9159402e 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -551,7 +551,9 @@ static void bch_btree_leaf_dirty(struct btree *b, atomic_t *journal_ref)
 
 		if (!w->journal) {
 			w->journal = journal_ref;
+			spin_lock(&b->c->journal.lock);
 			atomic_inc(w->journal);
+			spin_unlock(&b->c->journal.lock);
 		}
 	}
 
-- 
2.16.4