On Tue, 30 Apr 2013, Kent Overstreet wrote:
> Argh.
> That hang took a lot longer to show up than last time, didn't it?
> Wonder if I might have fixed part of the problem.

Well, not exactly. Before this I had triggered the bug twice, the first
time after four days of uptime and the second after two days. Since it
took four days this time as well, it's hard to say whether the bug has
been affected at all. But yes, since this is my main desktop computer
and the workload on it is real (not synthetic) and not very heavy, it
does take a long time to trigger. Perhaps Heiko can trigger it faster,
assuming his setup still has this bug?

I do have a spare computer and disks at work that I can use for a test
setup (with the same kernel and block stack), though it will take a few
days to free up an SSD for it. If you can provide scripts to generate a
suitable workload, then maybe the bug could be reproduced faster.

> On Tue, Apr 30, 2013 at 9:13 AM, Juha Aatrokoski <jha@xxxxxxxxxxx> wrote:
>> On Fri, 26 Apr 2013, Juha Aatrokoski wrote:
>>> On Sun, 21 Apr 2013, Juha Aatrokoski wrote:
>>>> On Sat, 20 Apr 2013, Kent Overstreet wrote:
>>>>> On Thu, Apr 18, 2013 at 02:42:59PM +0300, Juha Aatrokoski wrote:
>>>>>> I ran into the same bug that Heiko Wundram reported a while back:
>>>>>> after a few (2-4) days of normal desktop usage, bcache hangs and
>>>>>> dstat shows a continuous 50 MB/s write to the cache SSD partition,
>>>>>> with one CPU core maxed out in IO wait. I remembered Kent's answer
>>>>>> to the later message, but echoing 0 to writeback_running, at least,
>>>>>> did nothing on my system. The bcache device consists of an SSD
>>>>>> partition and two disks in md raid0, with dm-crypt on top of bcache.
>>>>>>
>>>>>> The branch I used was bcache, patched onto a 3.7.10 (Gentoo)
>>>>>> kernel. Is bcache the correct branch, or should I be using
>>>>>> bcache-for-upstream? (There are many branches in the git repo, but
>>>>>> I can't find any documentation or description of what they are and
>>>>>> who should use them.)
>>>>>>
>>>>>> I had to revert to the previous setup, bcache-3.2 patched onto a
>>>>>> 3.6.11 kernel, which doesn't have the problem. But it looks like
>>>>>> bcache-3.2 cannot be trivially (i.e. without more detailed knowledge
>>>>>> of bcache and the bio subsystem) ported to 3.7 kernels, so I guess
>>>>>> I'll be stuck with 3.6 until the bug is fixed.
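
A rough way to watch for the symptom described in the report above (a
sustained write stream to the cache SSD partition) without keeping dstat
open is to sample /proc/diskstats. The sketch below is illustrative only
and was not part of the original thread. The helper names, the sampling
interval, and any device name passed in are assumptions; it relies only
on the /proc/diskstats layout, in which the tenth field is the number of
sectors written, counted in 512-byte units.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts 512-byte sectors


def sectors_written(diskstats_text, device):
    """Return the cumulative sectors-written counter for `device`."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major minor name reads ...; the tenth field is sectors written.
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise KeyError(f"device {device!r} not found in diskstats")


def write_rate_mb_s(before, after, seconds):
    """Convert two sectors-written samples into MB/s (1 MB = 10**6 bytes)."""
    return (after - before) * SECTOR_BYTES / seconds / 1e6


def sample_write_rate(device, interval=10.0, path="/proc/diskstats"):
    """Read the counter twice, `interval` seconds apart, and return MB/s."""
    with open(path) as f:
        first = sectors_written(f.read(), device)
    time.sleep(interval)
    with open(path) as f:
        second = sectors_written(f.read(), device)
    return write_rate_mb_s(first, second, interval)
```

Calling sample_write_rate("sda2") periodically (with "sda2" standing in
for the actual cache partition) and logging the result would show when
the write rate stays near 50 MB/s while the desktop is otherwise idle.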
>>>>> So, I'm not having any luck with reproducing it, but I do have a
>>>>> possible fix in the bcache-testing branch - any chance you can give that
>>>>> a try and let me know if it works?
>>>> OK, I'll test it when I have time, which may not be until next weekend.
>>> I just patched my 3.7.10 Gentoo kernel with the latest bcache branch
>>> (which I understand should have the fix for this) and rebooted. I'll let you
>>> know if the problem still persists.
>> No luck, same symptoms after four days running the latest bcache branch.