> Yeah, definitely. We haven't been aggressive about sending cephfs fixes
> to stable but will start doing so soon.

This would be very welcome! On a related note, I saw many RCU stalls
like the one below. Looking through the commit logs, I stumbled upon
these possibly related fixes:

03974e8177b36d672eb59658f976f03cb77c1129
  ceph: make sure request isn't in any waiting list when kicking request.
656e4382948d4b2c81bdaf707f1400f53eff2625
  ceph: protect kick_requests() with mdsc->mutex
282c105225ec3229f344c5fced795b9e1e634440
  ceph: fix kick_requests()

They also apply cleanly with an offset to 3.14 and have all been included
since at least 3.18. Maybe they are also good candidates for inclusion in
stable, if I haven't missed some hidden dependency on another patch.

------
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087230] INFO: rcu_sched self-detected stall on CPU { 56} (t=10731918 jiffies g=3574092 c=3574091 q=0)
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087250] sending NMI to all CPUs:
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087279] NMI backtrace for cpu 56
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087286] CPU: 56 PID: 1276 Comm: kworker/56:2 Tainted: P W O 3.14.26-gentoo #1
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087288] Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087314] Workqueue: ceph-msgr con_work [libceph]
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087317] task: ffff883ff9f0b1e0 ti: ffff883d72e42000 task.ti: ffff883d72e42000
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087319] RIP: 0010:[<ffffffff8102750a>] [<ffffffff8102750a>] default_send_IPI_mask_sequence_phys+0x4e/0x68
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087332] RSP: 0000:ffff884026c03dd0 EFLAGS: 00000087
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087334] RAX: ffff884026c40000 RBX: 0000000000000039 RCX: 0000000000000039
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087336] RDX: fe00000000000000 RSI: 0000000000000002 RDI: fe00000000000000
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087338] RBP: ffff884026c03df8 R08: 0000000000000000 R09: ffffffff81886dc0
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087340] R10: ffff884026c03f00 R11: 0000000000000000 R12: 0000000000000096
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087342] R13: ffffffff81886dc0 R14: 0000000000000002 R15: 000000000000a10a
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087345] FS: 00007f545ab9f840(0000) GS:ffff884026c00000(0000) knlGS:0000000000000000
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087347] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087349] CR2: 00007fd749494288 CR3: 000000000180b000 CR4: 00000000000407e0
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087350] Stack:
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087352] 0000000000002710 ffffffff818389c0 0000000000000038 ffffffff818385c0
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087355] ffff884026c0c100 ffff884026c03e08 ffffffff8102aa14 ffff884026c03e20
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087358] ffffffff81027671 ffff884026c0c8c0 ffff884026c03e78 ffffffff8107dd8e
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087361] Call Trace:
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087363] <IRQ>
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087366] [<ffffffff8102aa14>] physflat_send_IPI_all+0x12/0x14
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087374] [<ffffffff81027671>] arch_trigger_all_cpu_backtrace+0x4d/0x80
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087380] [<ffffffff8107dd8e>] rcu_check_callbacks+0x1d1/0x4e0
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087385] [<ffffffff81041d77>] update_process_times+0x38/0x60
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087389] [<ffffffff8108650a>] tick_sched_handle+0x35/0x37
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087392] [<ffffffff810869c5>] tick_sched_timer+0x35/0x53
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087397] [<ffffffff81052d05>] __run_hrtimer.isra.25+0x72/0xcb
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087401] [<ffffffff810533fc>] hrtimer_interrupt+0xe6/0x1c8
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087404] [<ffffffff81026453>] local_apic_timer_interrupt+0x4f/0x52
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087407] [<ffffffff81026695>] smp_apic_timer_interrupt+0x2b/0x3c
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087414] [<ffffffff813ccd8a>] apic_timer_interrupt+0x6a/0x70
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087415] <EOI>
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087417] [<ffffffff811bdee9>] ? rb_next+0x2d/0x3d
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087435] [<ffffffffa0327dff>] kick_requests+0x2f5/0x38d [libceph]
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087448] [<ffffffffa0328c91>] ceph_osdc_handle_map+0x2f7/0x4cc [libceph]
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087459] [<ffffffffa032541f>] dispatch+0x588/0x5d2 [libceph]
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087470] [<ffffffffa032541f>] ? dispatch+0x588/0x5d2 [libceph]
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087480] [<ffffffffa0321b1a>] con_work+0xdb5/0x2374 [libceph]
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087486] [<ffffffff8105db76>] ? vtime_common_task_switch+0x25/0x28
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087491] [<ffffffff8104ba9f>] process_one_work+0x154/0x221
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087494] [<ffffffff8104c1e2>] worker_thread+0x13e/0x1d7
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087497] [<ffffffff8104c0a4>] ? cancel_delayed_work_sync+0x10/0x10
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087500] [<ffffffff81050cc5>] kthread+0xb2/0xba
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087503] [<ffffffff81050c13>] ? __kthread_parkme+0x62/0x62
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087506] [<ffffffff813cc13c>] ret_from_fork+0x7c/0xb0
2014-12-23T00:44:39+01:00 kaa-103 kernel: [252114.087509] [<ffffffff81050c13>] ? __kthread_parkme+0x62/0x62

On Tue, Jan 6, 2015 at 3:32 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Tue, 6 Jan 2015, Markus Blank-Burian wrote:
>> Hi,
>>
>> as discussed in http://tracker.ceph.com/issues/10450 the 3.14 kernel
>> sometimes hits a NULL pointer dereference if the MDS server crashes.
>> The corresponding fix is in commit
>> 00bd8edb861eb41d274938cfc0338999d9c593a3 which only adds a list_empty
>> check. The patch applies cleanly with a -1 offset to the 3.14 tree and
>> is included in mainline kernel since 3.15.
>> Can this patch be included in one of the next stable releases?
>
> backport.
>
> Greg, do you need a patch sent to stable@ or is the sha1 above enough?
>
> Thanks!
> sage
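
For anyone who wants to repeat the "applies cleanly to 3.14" check from the mails above, here is a minimal sketch in Python. It only builds and runs ordinary git commands (`git checkout -b` and `git cherry-pick -x`). The tag name `v3.14.26`, the branch name `backport-test`, and the helper names are assumptions made for illustration, and the commits are picked in the order they are listed in the mail, which is not necessarily their mainline order. Note that cherry-pick performs a three-way merge, so it is a looser test than applying the patch with an offset.

```python
import subprocess
from typing import List

# Commit ids quoted in the mail; listed order, not necessarily mainline order.
KICK_REQUESTS_FIXES = [
    "03974e8177b36d672eb59658f976f03cb77c1129",
    "656e4382948d4b2c81bdaf707f1400f53eff2625",
    "282c105225ec3229f344c5fced795b9e1e634440",
]


def backport_commands(base: str, commits: List[str],
                      branch: str = "backport-test") -> List[List[str]]:
    """Build the git commands for a trial backport (hypothetical helper).

    `base` is the stable tag or branch to start from; `branch` is a
    throwaway branch name chosen for this sketch.
    """
    cmds = [["git", "checkout", "-b", branch, base]]
    for sha in commits:
        # -x records "(cherry picked from commit ...)" in the commit message.
        cmds.append(["git", "cherry-pick", "-x", sha])
    return cmds


def try_backport(repo: str, base: str, commits: List[str]) -> bool:
    """Run the commands in `repo`; return False at the first failure."""
    for cmd in backport_commands(base, commits):
        if subprocess.run(cmd, cwd=repo).returncode != 0:
            return False
    return True
```

Calling `try_backport("/path/to/linux", "v3.14.26", KICK_REQUESTS_FIXES)` in a kernel tree that contains both the stable tag and the mainline commits returns `True` only if every pick succeeds. After a failed pick the tree is left mid-operation, so `git cherry-pick --abort` is needed before retrying.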