Hello,

Please queue 45002267e8d2 ("crush: ensuring at most num-rep osds are
selected", went into 4.1-rc1) for 3.15+. It doesn't say that in the
commit message, but it fixes http://tracker.ceph.com/issues/9492, which
on the kernel side manifests as the following:

[ 17.027382] BUG: unable to handle kernel NULL pointer dereference at 000000000000000c
[ 17.027382] IP: [<ffffffff8169f29a>] crush_choose_firstn+0x2ea/0x390
[ 17.027382] PGD 3a86f067 PUD 3a971067 PMD 0
[ 17.027382] Oops: 0000 [#1] PREEMPT SMP
[ 17.027382] Modules linked in:
[ 17.027382] CPU: 0 PID: 1358 Comm: rbd Tainted: G W 3.16.0-vm #15
[ 17.027382] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 17.027382] task: ffff88003e26d190 ti: ffff88003b0fc000 task.ti: ffff88003b0fc000
[ 17.027382] RIP: 0010:[<ffffffff8169f29a>] [<ffffffff8169f29a>] crush_choose_firstn+0x2ea/0x390
[ 17.027382] RSP: 0000:ffff88003b0ff960 EFLAGS: 00010246
[ 17.027382] RAX: 0000000000000000 RBX: ffff88003a8ca514 RCX: 0000000000000000
[ 17.027382] RDX: 00000000ffffffff RSI: ffff88003a8ca51c RDI: ffff88003b269dc0
[ 17.027382] RBP: ffff88003b0ffa48 R08: 00000000839e78ac R09: 0000000000000003
[ 17.027382] R10: 0000000000000000 R11: ffff88003b269dc0 R12: 0000000000000000
[ 17.027382] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88003b242b78
[ 17.027382] FS: 00007fa5bb8bd900(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 17.027382] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 17.027382] CR2: 000000000000000c CR3: 000000003a970000 CR4: 00000000000006b0
[ 17.027382] Stack:
[ 17.027382] 0000000000000005 ffff88003e26d190 ffffffff82543f30 ffff88003e26d978
[ 17.027382] ffff88003b0ff998 ffffffff81011396 ffffffff81fc2380 ffff88003b0ff9b0
[ 17.027382] ffffffff8107f6cb ffff88003e26d978 ffff88003b0ffa68 0000000000000000
[ 17.027382] Call Trace:
[ 17.027382] [<ffffffff81011396>] ? save_stack_trace+0x26/0x50
[ 17.027382] [<ffffffff8107f6cb>] ? save_trace+0x3b/0xc0
[ 17.027382] [<ffffffff8169fa00>] crush_do_rule+0x2b0/0x420
[ 17.027382] [<ffffffff8169e3f6>] ceph_calc_pg_acting+0x166/0x690
[ 17.027382] [<ffffffff81695538>] __map_request+0x1b8/0x7c0
[ 17.027382] [<ffffffff8108183d>] ? trace_hardirqs_on_caller+0x16d/0x200
[ 17.027382] [<ffffffff810818dd>] ? trace_hardirqs_on+0xd/0x10
[ 17.027382] [<ffffffff81694a12>] ? __schedule_osd_timeout+0x32/0x40
[ 17.027382] [<ffffffff81696510>] ? __register_request+0x180/0x190
[ 17.027382] [<ffffffff816973ac>] __ceph_osdc_start_request+0x3c/0x150
[ 17.027382] [<ffffffff81697502>] ceph_osdc_start_request+0x42/0x70
[ 17.027382] [<ffffffff814e9eff>] rbd_obj_request_submit+0x7f/0x90
[ 17.027382] [<ffffffff814ec4da>] rbd_obj_method_sync.constprop.29+0x17a/0x220
[ 17.027382] [<ffffffff814f0614>] rbd_dev_image_probe+0x114/0xe20
[ 17.027382] [<ffffffff81086a9b>] ? __init_rwsem+0x4b/0x70
[ 17.027382] [<ffffffff814f19bd>] do_rbd_add.isra.23+0x69d/0xdb0
[ 17.027382] [<ffffffff814f20df>] rbd_add_single_major+0xf/0x20
[ 17.027382] [<ffffffff814983c2>] bus_attr_store+0x22/0x30
[ 17.027382] [<ffffffff811ad38f>] sysfs_kf_write+0x3f/0x50
[ 17.027382] [<ffffffff811acc9f>] kernfs_fop_write+0xdf/0x160
[ 17.027382] [<ffffffff8113a943>] vfs_write+0xc3/0x1c0
[ 17.027382] [<ffffffff8113b394>] SyS_write+0x44/0xa0
[ 17.027382] [<ffffffff816b7e12>] system_call_fastpath+0x16/0x1b
[ 17.027382] Code: ed 48 8b 75 58 8b 85 74 ff ff ff 4c 8b 55 80 48 89 4d a8 48 8d 34 8e 48 89 75 a0 8b 75 20 03 45 98 8d 56 ff 48 8b 75 b8 89 45 c0 <41> 8b 42 0c 48 8d 34 96 48 89 75 b0 8b 75 20 8d 4e 01 89 4d 88
[ 17.027382] RIP [<ffffffff8169f29a>] crush_choose_firstn+0x2ea/0x390
[ 17.027382] RSP <ffff88003b0ff960>
[ 17.027382] CR2: 000000000000000c
[ 17.941409] ---[ end trace ac6119939c0828c7 ]---

Thanks,

Ilya