On Thu, Jul 18, 2013 at 03:05:49PM +0300, Juha Aatrokoski wrote:
> On Tue, 16 Jul 2013, Kent Overstreet wrote:
>
> >On Tue, Jul 16, 2013 at 09:14:09PM +0300, Juha Aatrokoski wrote:
> >>On Fri, 12 Jul 2013, Juha Aatrokoski wrote:
> >>>>Can you give this patch a try? It's on top of the current
> >>>>bcache-for-3.11 branch
> >>>
> >>>OK, now running the same kernel with this patch applied and
> >>>discard enabled. However, it has previously taken my system 2-4
> >>>days to trigger this bug, so I'd say at least two weeks before I
> >>>can say the patch (may have) fixed the issue.
> >>
> >>No such luck, hit the bug after four days of uptime. Disabling
> >>discard fixed the problem so at least it's not any worse than
> >>before.
> >
> >Argh, damn peculiar bug... and the fact that it takes so long to trigger
> >is frustrating. I'm honestly at a loss at this point as to what that IO
> >actually is.
>
> One thing I noticed is that your patch only affects the allocator,
> the journal still does discards the old way. Perhaps it's worth a
> try to apply a similar change to the journal discards?

Oh man, thanks for pointing me at that code. This looks like a brown
paper bag bug...

Try this patch and tell me what happens:

From 72c531ee46e73a63739aa3fd10130f167d6bd30d Mon Sep 17 00:00:00 2001
From: Kent Overstreet <kmo@xxxxxxxxxxxxx>
Date: Thu, 18 Jul 2013 10:50:55 -0700
Subject: [PATCH] Fix a dumb journal discard bug

diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index ba95ab8..c0017ca 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -428,7 +428,7 @@ static void do_journal_discard(struct cache *ca)
 		return;
 	}
 
-	switch (atomic_read(&ja->discard_in_flight) == DISCARD_IN_FLIGHT) {
+	switch (atomic_read(&ja->discard_in_flight)) {
 	case DISCARD_IN_FLIGHT:
 		return;
 
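
To spell out why the old line is wrong: the switch was evaluating the
result of a comparison, which in C is always 0 or 1, so the case labels
were matched against that 0/1 and not against the actual state. Below is
a minimal userspace sketch of the effect. The numeric values of the
DISCARD_* constants and the dispatch_buggy()/dispatch_fixed() helpers
are assumptions made up for illustration (only DISCARD_IN_FLIGHT appears
in the hunk above); the real definitions are in the bcache headers and
the real function does more than return a label.

```c
#include <stdio.h>

/*
 * Assumed values, for illustration only. Only DISCARD_IN_FLIGHT is
 * visible in the patch hunk; the other names and all numeric values
 * are hypothetical stand-ins for the real bcache definitions.
 */
enum {
	DISCARD_READY     = 0,
	DISCARD_IN_FLIGHT = 1,
	DISCARD_DONE      = 2,
};

/*
 * Old behaviour: the controlling expression is a comparison, so it can
 * only ever be 0 or 1. Returns the label of the case that was taken.
 * (Compilers may warn about switching on a boolean here; that is the bug.)
 */
int dispatch_buggy(int state)
{
	switch (state == DISCARD_IN_FLIGHT) {
	case DISCARD_IN_FLIGHT:
		return DISCARD_IN_FLIGHT;
	case DISCARD_DONE:
		return DISCARD_DONE;	/* never taken: expression is never 2 */
	case DISCARD_READY:
		return DISCARD_READY;
	}
	return -1;
}

/* Patched behaviour: switch on the state value itself. */
int dispatch_fixed(int state)
{
	switch (state) {
	case DISCARD_IN_FLIGHT:
		return DISCARD_IN_FLIGHT;
	case DISCARD_DONE:
		return DISCARD_DONE;
	case DISCARD_READY:
		return DISCARD_READY;
	}
	return -1;
}
```

Under those assumed values, dispatch_buggy(DISCARD_DONE) takes the
DISCARD_READY branch, because the comparison yields 0, while
dispatch_fixed(DISCARD_DONE) takes the DISCARD_DONE branch as intended.
Whether the real code misbehaves in exactly this way depends on the
actual constant values.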