In bch_btree_leaf_dirty(), the bcache journal pin counter is currently
increased by calling atomic_inc(w->journal) directly. This is
problematic, because it may cause the following code in
journal.c:journal_reclaim() to work improperly,

610		while (!atomic_read(&fifo_front(&c->journal.pin)))
611			fifo_pop(&c->journal.pin, p);

The above code is protected by the spinlock c->journal.lock, and the
atomic counter w->journal in btree.c:bch_btree_leaf_dirty() is one of
the nodes of c->journal.pin. If the above while() loop happens to reach
the fifo node that w->journal points to in bch_btree_leaf_dirty(), it
is possible that between line 610 and line 611 the counter w->journal
is increased, but the node is popped off in journal_reclaim() anyway.
Then the journal jset which w->journal references in
bch_btree_leaf_dirty() gets lost.

If the system crashes or reboots before the bkeys of the lost jset are
flushed back to the bcache btree node, journal_replay() after the
reboot may complain that some journal entries are lost and fail to
register the cache set.

Such a race condition is very rare. I observed it after changing the
number of journal buckets to 3, which leaves only a limited number of
jsets available. With that change it is possible to observe journal
replay failures due to lost journal jset(s).

This patch holds c->journal.lock around atomic_inc(w->journal) in
bch_btree_leaf_dirty().

Signed-off-by: Coly Li <colyli@xxxxxxx>
---
 drivers/md/bcache/btree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 23cb1dc7296b..ac1b9159402e 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -551,7 +551,9 @@ static void bch_btree_leaf_dirty(struct btree *b, atomic_t *journal_ref)
 		if (!w->journal) {
 			w->journal = journal_ref;
+			spin_lock(&b->c->journal.lock);
 			atomic_inc(w->journal);
+			spin_unlock(&b->c->journal.lock);
 		}
 	}
-- 
2.16.4
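
For readers who want to step through the interleaving described in the
commit message, here is a minimal user-space sketch. It is not kernel
code: struct toy_journal and every toy_* helper are hypothetical
stand-ins invented for illustration, and the "race" is replayed
deterministically in one thread rather than provoked with real
concurrency. It only models the window between the counter test and the
pop, assuming the pin FIFO behaves like a simple ring of counters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PIN_SLOTS 4

/* Hypothetical stand-in for c->journal: a lock plus a FIFO of pin counters. */
struct toy_journal {
	atomic_flag lock;               /* plays the role of c->journal.lock */
	atomic_int pin[TOY_PIN_SLOTS];  /* plays the role of c->journal.pin */
	size_t front;                   /* index of the oldest entry */
	size_t used;                    /* number of entries in the FIFO */
};

static void toy_lock(struct toy_journal *j)
{
	while (atomic_flag_test_and_set_explicit(&j->lock, memory_order_acquire))
		; /* spin until the flag is released */
}

static void toy_unlock(struct toy_journal *j)
{
	atomic_flag_clear_explicit(&j->lock, memory_order_release);
}

static void toy_journal_init(struct toy_journal *j, size_t used)
{
	assert(used <= TOY_PIN_SLOTS);
	atomic_flag_clear(&j->lock);
	for (size_t i = 0; i < TOY_PIN_SLOTS; i++)
		atomic_init(&j->pin[i], 0);
	j->front = 0;
	j->used = used;
}

/* Drop the oldest entry; the caller must hold the lock. */
static void toy_pop(struct toy_journal *j)
{
	j->front = (j->front + 1) % TOY_PIN_SLOTS;
	j->used--;
}

/* Same shape as the while() loop quoted above; returns entries popped. */
static size_t toy_reclaim(struct toy_journal *j)
{
	size_t popped = 0;

	toy_lock(j);
	while (j->used && !atomic_load(&j->pin[j->front])) {
		toy_pop(j);
		popped++;
	}
	toy_unlock(j);
	return popped;
}

/* Pin in the style of the patch: increment while holding the lock. */
static void toy_pin(struct toy_journal *j, atomic_int *ref)
{
	toy_lock(j);
	atomic_fetch_add(ref, 1);
	toy_unlock(j);
}

/*
 * Replay the bad interleaving step by step: the test sees 0, an
 * increment that does not take the lock lands, and the pop still runs.
 * Returns true if a pinned entry was popped.
 */
static bool toy_replay_unlocked_race(struct toy_journal *j)
{
	atomic_int *front = &j->pin[j->front];
	bool lost = false;

	toy_lock(j);
	if (j->used && !atomic_load(front)) {   /* "line 610": counter is 0 */
		atomic_fetch_add(front, 1);     /* lockless increment sneaks in */
		toy_pop(j);                     /* "line 611": popped regardless */
		lost = atomic_load(front) != 0;
	}
	toy_unlock(j);
	return lost;
}
```

With one entry in the FIFO, toy_replay_unlocked_race() reports that a
pinned entry was popped, whereas calling toy_pin() first and then
toy_reclaim() leaves the entry in place, because the increment can no
longer fall between the test and the pop.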