On Wed, 15 Apr 2015, Nikolay Borisov wrote: > Thanks for the patches, I've applied them to 4.0 and am in the process of > testing that. > > Do you happen to know in which (if any) repo do those patches live and if > there is a way to "reliably" (e.g. a repo where they are being applied) track > them or have you just collected them from misc postings to the mailing list? I keep meaning to create a git repository of my own with all of the patches, but that is not publically available yet. For the moment, I have been collecting all of the patches as they show up on the mailing list. I review them to make sure that they make sense, then add them to my list of bcache fixes. The specific fix that you are looking for is probably the rcu_schedpatch, but the other ones are good for stability as well. -Eric > Regards, > Nikolay > > On 04/14/2015 11:03 PM, Eric Wheeler wrote: > > Apply all of the attached patches to your kernel and try again. > > > > I wish somebody would apply these upstream and get it into the official > > kernel. I have been carrying all of these patches with me for some time > > and they definitely make bcache more stable. > > > > -Eric > > > > > > -- > > Eric Wheeler, President eWheeler, Inc. dba Global Linux Security > > 888-LINUX26 (888-546-8926) Fax: 503-716-3878 PO Box 25107 > > www.GlobalLinuxSecurity.pro Linux since 1996! Portland, OR 97298 > > > > On Tue, 14 Apr 2015, Nikolay Borisov wrote: > > > > > Hello list, > > > > > > > > > I'm currently testing bcache with the following setup: > > > > > > NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT > > > > > > sda 8:0 0 1.8T 0 disk > > > ??sda1 8:1 0 2M 0 part > > > ??sda2 8:2 0 191M 0 part > > > ??sda3 8:3 0 1.8T 0 part > > > ??main-os (dm-0) 254:0 0 1.8T 0 lvm / > > > sdb 8:16 0 223.1G 0 disk > > > ??sdb1 8:17 0 188.2M 0 part /boot > > > ??sdb2 8:18 0 222.9G 0 part > > > ??main-ssd (dm-1) 254:1 0 40G 0 lvm > > > ? ??bcache0 253:0 0 182G 0 disk /sdb > > > ??main-db (dm-2) 254:2 0 182G 0 lvm > > > ??bcache0 253:0 0 182G 0 disk /sdb > > > > > > So a 40gig ssd (main-ssd, lvm2 volume) backed by a 180gig hdd (main-db, > > > lvm 2 > > > volume), using the writeback cache policy. Every other setting is at its > > > default. I'm running the 4.0-rc6 (!CONFIG_PREEMPT). After running fio > > > (using a > > > 30gb file) with a mix of sequential and random i/o and I'm getting the > > > following RCU warn: > > > > > > ====================================================== > > > INFO: rcu_sched self-detected stall on CPU > > > 4: (2099 ticks this GP) idle=fcf/140000000000001/0 > > > softirq=1031582/1031582 fqs=2100 > > > INFO: rcu_sched detected stalls on CPUs/tasks: > > > 4: (2099 ticks this GP) idle=fcf/140000000000001/0 > > > softirq=1031582/1031582 fqs=2100 > > > (detected by 16, t=2104 jiffies, g=2176431, c=2176430, q=3098) > > > Task dump for CPU 4: > > > bcache_gc R running task 12728 18115 2 0x00000008 > > > ffff880079e85720 fffffffffffffffc ffff88046c180e20 fffffffffffffffc > > > ffffffff81091693 fffffffffffffffc ffff88086aa3d000 ffff88046c180000 > > > ffff88086aa3d000 ffff88046c180000 ffff88086aa3d060 ffff88086aa3d000 > > > Call Trace: > > > [<ffffffff81091693>] ? __wake_up+0x53/0x70 > > > [<ffffffffa01103d4>] ? bch_btree_gc+0x2f4/0x560 [bcache] > > > [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0 > > > [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20 > > > [<ffffffffa0110678>] ? bch_gc_thread+0x38/0x120 [bcache] > > > [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache] > > > [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache] > > > [<ffffffff81070a9e>] ? kthread+0xce/0xf0 > > > [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70 > > > [<ffffffff815b8818>] ? ret_from_fork+0x58/0x90 > > > [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70 > > > (t=2228 jiffies g=2176431 c=2176430 q=3161) > > > Task dump for CPU 4: > > > bcache_gc R running task 12728 18115 2 0x00000008 > > > 0000000000000005 ffff88046fc83ca8 ffffffff8107720b 0000000000000004 > > > ffffffff8183d040 ffff88046fc83cc8 ffffffff810772af ffff88046fc83cc8 > > > ffffffff8183d100 ffff88046fc83cf8 ffffffff810a5101 ffff88046fc94500 > > > Call Trace: > > > <IRQ> [<ffffffff8107720b>] sched_show_task+0xcb/0x130 > > > [<ffffffff810772af>] dump_cpu_task+0x3f/0x50 > > > [<ffffffff810a5101>] rcu_dump_cpu_stacks+0x91/0xd0 > > > [<ffffffff810a68cf>] rcu_check_callbacks+0x65f/0xc30 > > > [<ffffffff81080ecc>] ? account_process_tick+0x6c/0x170 > > > [<ffffffff810acf29>] update_process_times+0x39/0x70 > > > [<ffffffff810beba0>] tick_sched_handle+0x40/0x50 > > > [<ffffffff810bedb2>] tick_sched_timer+0x52/0xa0 > > > [<ffffffff810afa16>] __run_hrtimer+0x86/0x1d0 > > > [<ffffffff810bed60>] ? tick_nohz_handler+0xc0/0xc0 > > > [<ffffffff810afd92>] hrtimer_interrupt+0x102/0x240 > > > [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache] > > > [<ffffffff81032e79>] local_apic_timer_interrupt+0x39/0x60 > > > [<ffffffff815bb355>] smp_apic_timer_interrupt+0x45/0x59 > > > [<ffffffffa0109920>] ? bch_ptr_invalid+0x10/0x10 [bcache] > > > [<ffffffff815b972d>] apic_timer_interrupt+0x6d/0x80 > > > <EOI> [<ffffffffa01117c5>] ? __bch_extent_invalid+0xa5/0xd0 [bcache] > > > [<ffffffffa0111721>] ? __bch_extent_invalid+0x1/0xd0 [bcache] > > > [<ffffffffa0111802>] ? bch_extent_invalid+0x12/0x20 [bcache] > > > [<ffffffffa011183d>] bch_extent_bad+0x2d/0x1c0 [bcache] > > > [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache] > > > [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache] > > > [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache] > > > [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache] > > > [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache] > > > [<ffffffff81091693>] ? __wake_up+0x53/0x70 > > > [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache] > > > [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0 > > > [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20 > > > [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache] > > > [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache] > > > [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache] > > > [<ffffffff81070a9e>] kthread+0xce/0xf0 > > > [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70 > > > [<ffffffff815b8818>] ret_from_fork+0x58/0x90 > > > > > > Naturally, checking > > > /sys/fs/bcache/b9bcddd1-7a9a-4f2f-88e6-cb5bef6abcf2/internal/btree_gc_max_duration_ms > > > shows: 31593 Clearly at some point the GC overhead becomes so large that > > > it > > > causes RCU grace period stalls. I'm puzzled since bch_btree_gc_finish(...) > > > is > > > not listed and this is the only function that pertains to bcache gc AND > > > executes code in RCU critical read section. > > > > > > In addition to that I also observed that the after this RCU stall warn > > > occurs > > > the bcache_gc thread hogs the machine at 100% rendering it unusable. I > > > managed > > > to get 2 call stack dumps via magic sysrq as follows: > > > > > > ============= > > > NMI backtrace for cpu 4 > > > CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4 > > > Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013 > > > task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000 > > > RIP: 0010:[<ffffffffa01098cb>] [<ffffffffa01098cb>] > > > bch_btree_iter_next_filter+0xb/0x50 [bcache] > > > RSP: 0018:ffff880868027bd8 EFLAGS: 00000202 > > > RAX: 0000000000000001 RBX: 0000000000002034 RCX: 000000000000000a > > > RDX: ffffffffa0109920 RSI: ffff88086aa3dcd0 RDI: ffff880868027c08 > > > RBP: ffff880868027bf8 R08: 0000000000000001 R09: 0000000000000001 > > > R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0 > > > R13: ffff880868027c08 R14: ffff880868027cf8 R15: ffff880868027dd8 > > > FS: 0000000000000000(0000) GS:ffff88046fc80000(0000) > > > knlGS:0000000000000000 > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0 > > > Stack: > > > 0000000000002034 ffff88086aa3dcd0 ffff880868027c08 ffff880868027cf8 > > > ffff880868027c78 ffffffffa0109c80 0000000000000004 0000000000000001 > > > ffff8807740101c0 ffff88077402a9f8 ffff88086aa3dc00 ffff88086aa3d000 > > > Call Trace: > > > [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache] > > > [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache] > > > [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache] > > > [<ffffffff81091693>] ? __wake_up+0x53/0x70 > > > [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache] > > > [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0 > > > [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20 > > > [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache] > > > [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache] > > > [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache] > > > [<ffffffff81070a9e>] kthread+0xce/0xf0 > > > [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70 > > > [<ffffffff815b8818>] ret_from_fork+0x58/0x90 > > > [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70 > > > Code: ff 48 89 d7 4c 29 cf eb a7 48 29 f2 48 89 d6 e9 18 ff ff ff 66 66 66 > > > 2e > > > 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56 41 55 41 54 53 <0f> 1f 44 00 00 > > > 48 > > > 89 fb 49 89 f4 49 89 d6 0f 1f 80 00 00 00 00 > > > > > > > > > =========================================== > > > > > > NMI backtrace for cpu 4 > > > CPU: 4 PID: 18115 Comm: bcache_gc Not tainted 4.0.0-rc6bcache1-nikbor #4 > > > Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013 > > > task: ffff88086ab093e0 ti: ffff880868024000 task.ti: ffff880868024000 > > > RIP: 0010:[<ffffffffa0111916>] [<ffffffffa0111916>] > > > bch_extent_bad+0x106/0x1c0 [bcache] > > > RSP: 0018:ffff880868027ba8 EFLAGS: 00000202 > > > RAX: 000000000000bd2a RBX: ffff88077400a550 RCX: 000000000000000a > > > RDX: 0000000000000004 RSI: 00000000fc390004 RDI: ffff88046c180000 > > > RBP: ffff880868027bb8 R08: 0000000000000001 R09: 0000000000000000 > > > R10: 000007ffffffffff R11: 0000000000000008 R12: ffff88086aa3dcd0 > > > R13: ffff88077400a550 R14: ffffffffa0109920 R15: ffff880868027dd8 > > > FS: 0000000000000000(0000) GS:ffff88046fc80000(0000) > > > knlGS:0000000000000000 > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > CR2: 00007f4f6b410008 CR3: 000000000180e000 CR4: 00000000001406e0 > > > Stack: > > > ffff880868027c08 ffff88086aa3dcd0 ffff880868027bc8 ffffffffa010992a > > > ffff880868027bf8 ffffffffa01098f9 00000000000014a6 ffff88086aa3dcd0 > > > ffff880868027c08 ffff880868027cf8 ffff880868027c78 ffffffffa0109c80 > > > Call Trace: > > > [<ffffffffa010992a>] bch_ptr_bad+0xa/0x10 [bcache] > > > [<ffffffffa01098f9>] bch_btree_iter_next_filter+0x39/0x50 [bcache] > > > [<ffffffffa0109c80>] btree_gc_count_keys+0x50/0x70 [bcache] > > > [<ffffffffa010ffbf>] btree_gc_recurse+0x1bf/0x2e0 [bcache] > > > [<ffffffffa010c4ac>] ? btree_gc_mark_node+0xdc/0x210 [bcache] > > > [<ffffffff81091693>] ? __wake_up+0x53/0x70 > > > [<ffffffffa01103d4>] bch_btree_gc+0x2f4/0x560 [bcache] > > > [<ffffffff8100180b>] ? __switch_to+0xbb/0x5f0 > > > [<ffffffff810911f0>] ? woken_wake_function+0x20/0x20 > > > [<ffffffffa0110678>] bch_gc_thread+0x38/0x120 [bcache] > > > [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache] > > > [<ffffffffa0110640>] ? bch_btree_gc+0x560/0x560 [bcache] > > > [<ffffffff81070a9e>] kthread+0xce/0xf0 > > > [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70 > > > [<ffffffff815b8818>] ret_from_fork+0x58/0x90 > > > [<ffffffff810709d0>] ? kthread_freezable_should_stop+0x70/0x70 > > > Code: 33 25 ff 0f 00 00 48 8b 94 c7 40 0c 00 00 48 89 f0 48 8b 92 d8 0a 00 > > > 00 > > > 48 c1 e8 08 4c 21 d0 48 d3 e8 48 8d 04 40 0f b6 54 82 06 <40> 28 f2 80 fa > > > 80 > > > 0f 87 7e 00 00 00 0f b6 d2 83 fa 60 76 66 31 > > > > > > > > > In the mean time I'm running the stable 4.0.0 where I observe better > > > results ( > > > no bcache_gc thread hog but still the occasional stall warn) > > > > > > Regards, > > > Nikolay > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-bcache" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html