On 12/14/21 6:04 AM, Mauricio Faria de Oliveira wrote:
Hey Kent and Coly,
It turns out that, at least for the disk image that reproduces the issue,
the closure from bch_btree_set_root() to bch_journal_meta() doesn't make
a difference; the stall is in bch_journal() -> journal_wait_for_write().
So the previous suggestion to skip bch_journal_meta() altogether works
to get things going, provided we check for the journal replay/full case.
What do you think of this patch?
It simply checks for the conditions under which run_cache_set() calls
bch_journal_replay().
(it starts w/ unlikely(!CACHE_SET_RUNNING) to quickly get to the usual case,
and apparently has an extra strict check for !gc_thread, just in case.
And it is journal_full() only, as the !journal_full() case in
journal_wait_for_write() seems to be handled via another function per the
comment.)
This works w/ the disk image here.
Hi Mauricio,
The following patch might work, but it is not a proper fix. I am traveling
at the moment, and I hope to have time soon to refine a patch for this
no-space issue in the journal.
I have a patch, but it needs to be rebased onto the latest bcache code.
Thanks.
Coly Li
Thanks!
Mauricio
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 72abe5cf4b12..bedeffc3ae28 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -2477,9 +2477,6 @@ int bch_btree_insert(struct cache_set *c, struct keylist *keys,
 void bch_btree_set_root(struct btree *b)
 {
 	unsigned int i;
-	struct closure cl;
-
-	closure_init_stack(&cl);
 
 	trace_bcache_btree_set_root(b);
 
@@ -2494,8 +2491,18 @@ void bch_btree_set_root(struct btree *b)
 
 	b->c->root = b;
 
-	bch_journal_meta(b->c, &cl);
-	closure_sync(&cl);
+	/* Don't journal during replay if journal is full (prevents deadlock) */
+	if (unlikely(!test_bit(CACHE_SET_RUNNING, &b->c->flags)) &&
+	    CACHE_SYNC(&b->c->cache->sb) && b->c->gc_thread == NULL &&
+	    journal_full(&b->c->journal)) {
+		pr_info("Not journaling new root (replay with full journal)\n");
+	} else {
+		struct closure cl;
+
+		closure_init_stack(&cl);
+		bch_journal_meta(b->c, &cl);
+		closure_sync(&cl);
+	}
 }
 
 /* Map across nodes or keys */
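
For anyone reading along without a kernel tree, the condition the patch adds
to bch_btree_set_root() can be modeled in userspace roughly as below. This is
only a sketch: struct model_cache_set, its fields, skip_root_journal() and
model_self_check() are hypothetical stand-ins invented for illustration, not
bcache code. Each boolean simply mirrors one term of the if () in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the state the patch inspects. */
struct model_cache_set {
	bool running;       /* models test_bit(CACHE_SET_RUNNING, &c->flags) */
	bool sync;          /* models CACHE_SYNC(&c->cache->sb) */
	void *gc_thread;    /* models c->gc_thread, NULL until GC is started */
	bool journal_full;  /* models journal_full(&c->journal) */
};

/*
 * Returns true when the new root would NOT be journaled, i.e. when all
 * four terms of the patch's condition hold: the cache set is not running
 * yet, it is a sync cache set, the GC thread does not exist yet, and the
 * journal is full.
 */
bool skip_root_journal(const struct model_cache_set *c)
{
	return !c->running && c->sync && c->gc_thread == NULL &&
	       c->journal_full;
}

/* Exercises the three cases discussed in the thread (assumed scenarios). */
void model_self_check(void)
{
	int fake_thread; /* any non-NULL address stands in for a GC thread */
	struct model_cache_set replay_full  = { false, true, NULL, true };
	struct model_cache_set replay_space = { false, true, NULL, false };
	struct model_cache_set running_full = { true, true, &fake_thread, true };

	/* Replay with a full journal: skip the journal write. */
	assert(skip_root_journal(&replay_full));
	/* Replay with free journal space: journal the root as usual. */
	assert(!skip_root_journal(&replay_space));
	/* Normal operation: always journal, even if the journal is full. */
	assert(!skip_root_journal(&running_full));
}
```

In this model only the first case takes the "Not journaling new root" branch;
every other combination falls through to the unchanged
bch_journal_meta() / closure_sync() path.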