On 06/19/2012 01:32 PM, Travis Rhoden wrote:
> Hey folks,
>
> Ran into this today. Not sure what I did wrong. =)

It appears you are running Linux 3.2.0. The symptoms could be
explained by a bug that has been fixed in newer Ceph code.

Specifically, I think this is the fix whose absence could produce
something like this:

    rbd: don't drop the rbd_id too early
    https://github.com/ceph/ceph-client/commit/32eec68d2f233e8a6ae1cd326022f6862e2b9ce3

-Alex

> I had an RBD successfully mounted and was done with it. Proceeded to
> do the following:
>
> root@spcnode2:~# ls /sys/bus/rbd/devices/
> 0
> root@spcnode2:~# echo 0 > /sys/bus/rbd/remove
> root@spcnode2:~# ls /sys/bus/rbd/devices/ <--- At this point, I
> believe the RBD has been successfully removed
>
> ---- About an hour passes where I am messing with my ceph cluster.
> No other commands are run on this machine ----
> ---- New cluster is up. Time to mount my new RBD
>
> root@spcnode2:~# echo "10.55.30.0,10.55.30.1,10.55.30.2
> name=admin,secret=AQCNv+BPoPQENBAAxlm39kJ5XteNxg2S/dulXw== rbd
> perftest" | tee /sys/bus/rbd/add
> 10.55.30.0,10.55.30.1,10.55.30.2
> name=admin,secret=AQCNv+BPoPQENBAAxlm39kJ5XteNxg2S/dulXw== rbd
> perftest
> Segmentation fault
>
> Well that's ugly. What's in syslog?
> > Jun 19 11:16:56 spcnode2 kernel: [76564.387890] ------------[ cut here > ]------------ > Jun 19 11:16:56 spcnode2 kernel: [76564.392569] WARNING: at > /build/buildd/linux-3.2.0/fs/sysfs/inode.c:324 > sysfs_hash_and_remove+0xa9/0xb0() > Jun 19 11:16:56 spcnode2 kernel: [76564.402233] Hardware name: Relion 1702 > Jun 19 11:16:56 spcnode2 kernel: [76564.406079] sysfs: can not remove > 'bdi', no directory > Jun 19 11:16:56 spcnode2 kernel: [76564.411268] Modules linked in: rbd > libceph ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE > xt_state ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp xt_conntrack > iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 > ipmi_devintf ipmi_si iptable_filter ipmi_msghandler ip_tables x_tables > kvm_intel kvm bnep rfcomm bluetooth parport_pc ppdev nfsd nfs lockd > fscache auth_rpcgss nfs_acl sunrpc ext2 xfs vesafb ib_iser rdma_cm > ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp > libiscsi scsi_transport_iscsi bridge mtdchar i7core_edac psmouse 8021q > garp stp lp parport dm_multipath mac_hid serio_raw edac_core ioatdma > usbhid hid sfc mtd i2c_algo_bit igb mdio dca btrfs zlib_deflate > libcrc32c > Jun 19 11:16:56 spcnode2 kernel: [76564.477972] Pid: 6924, comm: bash > Tainted: G D W 3.2.0-25-generic #40-Ubuntu > Jun 19 11:16:56 spcnode2 kernel: [76564.485837] Call Trace: > Jun 19 11:16:56 spcnode2 kernel: [76564.488394] [<ffffffff810672af>] > warn_slowpath_common+0x7f/0xc0 > Jun 19 11:16:56 spcnode2 kernel: [76564.494511] [<ffffffff810673a6>] > warn_slowpath_fmt+0x46/0x50 > Jun 19 11:16:56 spcnode2 kernel: [76564.500348] [<ffffffff81192958>] > ? 
iput_final+0xe8/0x210 > Jun 19 11:16:56 spcnode2 kernel: [76564.505888] [<ffffffff811ebc59>] > sysfs_hash_and_remove+0xa9/0xb0 > Jun 19 11:16:56 spcnode2 kernel: [76564.512082] [<ffffffff811ee356>] > sysfs_remove_link+0x26/0x30 > Jun 19 11:16:56 spcnode2 kernel: [76564.517959] [<ffffffff812fb960>] > del_gendisk+0x100/0x260 > Jun 19 11:16:56 spcnode2 kernel: [76564.523448] [<ffffffffa0623868>] > rbd_dev_release+0x108/0x110 [rbd] > Jun 19 11:16:56 spcnode2 kernel: [76564.529861] [<ffffffff813f1407>] > device_release+0x27/0xa0 > Jun 19 11:16:56 spcnode2 kernel: [76564.535432] [<ffffffff8130cfdc>] > kobject_release+0x4c/0xa0 > Jun 19 11:16:56 spcnode2 kernel: [76564.541163] [<ffffffff8130cf90>] > ? kobject_del+0x40/0x40 > Jun 19 11:16:56 spcnode2 kernel: [76564.546694] [<ffffffff8130e686>] > kref_put+0x36/0x70 > Jun 19 11:16:56 spcnode2 kernel: [76564.551764] [<ffffffff8130ce97>] > kobject_put+0x27/0x60 > Jun 19 11:16:56 spcnode2 kernel: [76564.557126] [<ffffffff8131d33c>] > ? _kstrtoull+0x2c/0x90 > Jun 19 11:16:56 spcnode2 kernel: [76564.562523] [<ffffffff813f1167>] > put_device+0x17/0x20 > Jun 19 11:16:56 spcnode2 kernel: [76564.567808] [<ffffffff813f225e>] > device_unregister+0x1e/0x30 > Jun 19 11:16:56 spcnode2 kernel: [76564.573647] [<ffffffffa06211ea>] > rbd_remove+0x15a/0x160 [rbd] > Jun 19 11:16:56 spcnode2 kernel: [76564.579594] [<ffffffff813f3c47>] > bus_attr_store+0x27/0x30 > Jun 19 11:16:56 spcnode2 kernel: [76564.585113] [<ffffffff811ebebf>] > sysfs_write_file+0xef/0x170 > Jun 19 11:16:56 spcnode2 kernel: [76564.590907] [<ffffffff81177f23>] > vfs_write+0xb3/0x180 > Jun 19 11:16:56 spcnode2 kernel: [76564.596158] [<ffffffff8117824a>] > sys_write+0x4a/0x90 > Jun 19 11:16:56 spcnode2 kernel: [76564.601258] [<ffffffff81665c42>] > system_call_fastpath+0x16/0x1b > Jun 19 11:16:56 spcnode2 kernel: [76564.607321] ---[ end trace > ace27f1cbf93eeaa ]--- > Jun 19 11:16:57 spcnode2 kernel: [76564.612447] BUG: unable to handle > kernel NULL pointer dereference at 
0000000000000079 > Jun 19 11:16:57 spcnode2 kernel: [76564.620374] IP: > [<ffffffff811ed770>] sysfs_find_dirent+0x10/0x110 > Jun 19 11:16:57 spcnode2 kernel: [76564.626475] PGD 404514067 PUD > 5f89cc067 PMD 0 > Jun 19 11:16:57 spcnode2 kernel: [76564.630958] Oops: 0000 [#2] SMP > Jun 19 11:16:57 spcnode2 kernel: [76564.634254] CPU 5 > Jun 19 11:16:57 spcnode2 kernel: [76564.636113] Modules linked in: rbd > libceph ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE > xt_state ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp xt_conntrack > iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 > ipmi_devintf ipmi_si iptable_filter ipmi_msghandler ip_tables x_tables > kvm_intel kvm bnep rfcomm bluetooth parport_pc ppdev nfsd nfs lockd > fscache auth_rpcgss nfs_acl sunrpc ext2 xfs vesafb ib_iser rdma_cm > ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp > libiscsi scsi_transport_iscsi bridge mtdchar i7core_edac psmouse 8021q > garp stp lp parport dm_multipath mac_hid serio_raw edac_core ioatdma > usbhid hid sfc mtd i2c_algo_bit igb mdio dca btrfs zlib_deflate > libcrc32c > Jun 19 11:16:57 spcnode2 kernel: [76564.701251] > Jun 19 11:16:57 spcnode2 kernel: [76564.702740] Pid: 6924, comm: bash > Tainted: G D W 3.2.0-25-generic #40-Ubuntu Penguin Computing > Relion 1702/X8DTT > Jun 19 11:16:57 spcnode2 kernel: [76564.713752] RIP: > 0010:[<ffffffff811ed770>] [<ffffffff811ed770>] > sysfs_find_dirent+0x10/0x110 > Jun 19 11:16:57 spcnode2 kernel: [76564.722319] RSP: > 0018:ffff8805f8f9bc58 EFLAGS: 00010246 > Jun 19 11:16:57 spcnode2 kernel: [76564.727719] RAX: ffff8806186edbc0 > RBX: 0000000000000000 RCX: 00000000000988e6 > Jun 19 11:16:57 spcnode2 kernel: [76564.734892] RDX: ffffffff81a0158d > RSI: 0000000000000000 RDI: 0000000000000000 > Jun 19 11:16:57 spcnode2 kernel: [76564.742083] RBP: ffff8805f8f9bc78 > R08: ffffea00303f6580 R09: ffffffff8130cfe9 > Jun 19 11:16:57 spcnode2 kernel: [76564.749221] R10: ffff880c0fe5de28 > R11: 
0000000000000000 R12: 0000000000000000 > Jun 19 11:16:57 spcnode2 kernel: [76564.756437] R13: ffffffff81a0158d > R14: ffff880bf45a5a50 R15: ffff880c0fd1de18 > Jun 19 11:16:57 spcnode2 kernel: [76564.763630] FS: > 00007fe308eb7700(0000) GS:ffff880c3fc20000(0000) > knlGS:0000000000000000 > Jun 19 11:16:57 spcnode2 kernel: [76564.771717] CS: 0010 DS: 0000 ES: > 0000 CR0: 0000000080050033 > Jun 19 11:16:57 spcnode2 kernel: [76564.777549] CR2: 0000000000000079 > CR3: 00000005f89cd000 CR4: 00000000000006e0 > Jun 19 11:16:57 spcnode2 kernel: [76564.784738] DR0: 0000000000000000 > DR1: 0000000000000000 DR2: 0000000000000000 > Jun 19 11:16:57 spcnode2 kernel: [76564.791877] DR3: 0000000000000000 > DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Jun 19 11:16:57 spcnode2 kernel: [76564.798991] Process bash (pid: > 6924, threadinfo ffff8805f8f9a000, task ffff8806186edbc0) > Jun 19 11:16:57 spcnode2 kernel: [76564.807295] Stack: > Jun 19 11:16:57 spcnode2 kernel: [76564.809302] 0000000000000000 > 0000000000000000 ffffffff81a0158d ffff880bf45a5a50 > Jun 19 11:16:57 spcnode2 kernel: [76564.816832] ffff8805f8f9bca8 > ffffffff811ed9bc ffff8805f8f9bcd8 ffffffff81c34b00 > Jun 19 11:16:57 spcnode2 kernel: [76564.824341] ffff880605b36878 > 0000000000000000 ffff8805f8f9bce8 ffffffff811efa15 > Jun 19 11:16:57 spcnode2 kernel: [76564.831894] Call Trace: > Jun 19 11:16:57 spcnode2 kernel: [76564.834337] [<ffffffff811ed9bc>] > sysfs_get_dirent+0x3c/0x80 > Jun 19 11:16:57 spcnode2 kernel: [76564.840041] [<ffffffff811efa15>] > sysfs_remove_group+0x35/0x100 > Jun 19 11:16:57 spcnode2 kernel: [76564.846029] [<ffffffff810fee24>] > blk_trace_remove_sysfs+0x14/0x20 > Jun 19 11:16:57 spcnode2 kernel: [76564.852195] [<ffffffff812f50d9>] > blk_unregister_queue+0x59/0x80 > Jun 19 11:16:57 spcnode2 kernel: [76564.858270] [<ffffffff812fb97b>] > del_gendisk+0x11b/0x260 > Jun 19 11:16:57 spcnode2 kernel: [76564.863661] [<ffffffffa0623868>] > rbd_dev_release+0x108/0x110 [rbd] > Jun 19 11:16:57 spcnode2 
kernel: [76564.869962] [<ffffffff813f1407>] > device_release+0x27/0xa0 > Jun 19 11:16:57 spcnode2 kernel: [76564.875448] [<ffffffff8130cfdc>] > kobject_release+0x4c/0xa0 > Jun 19 11:16:57 spcnode2 kernel: [76564.881061] [<ffffffff8130cf90>] > ? kobject_del+0x40/0x40 > Jun 19 11:16:57 spcnode2 kernel: [76564.886502] [<ffffffff8130e686>] > kref_put+0x36/0x70 > Jun 19 11:16:57 spcnode2 kernel: [76564.891521] [<ffffffff8130ce97>] > kobject_put+0x27/0x60 > Jun 19 11:16:57 spcnode2 kernel: [76564.896739] [<ffffffff8131d33c>] > ? _kstrtoull+0x2c/0x90 > Jun 19 11:16:57 spcnode2 kernel: [76564.902043] [<ffffffff813f1167>] > put_device+0x17/0x20 > Jun 19 11:16:57 spcnode2 kernel: [76564.907226] [<ffffffff813f225e>] > device_unregister+0x1e/0x30 > Jun 19 11:16:57 spcnode2 kernel: [76564.913057] [<ffffffffa06211ea>] > rbd_remove+0x15a/0x160 [rbd] > Jun 19 11:16:57 spcnode2 kernel: [76564.918881] [<ffffffff813f3c47>] > bus_attr_store+0x27/0x30 > Jun 19 11:16:57 spcnode2 kernel: [76564.924436] [<ffffffff811ebebf>] > sysfs_write_file+0xef/0x170 > Jun 19 11:16:57 spcnode2 kernel: [76564.930174] [<ffffffff81177f23>] > vfs_write+0xb3/0x180 > Jun 19 11:16:57 spcnode2 kernel: [76564.935450] [<ffffffff8117824a>] > sys_write+0x4a/0x90 > Jun 19 11:16:57 spcnode2 kernel: [76564.940497] [<ffffffff81665c42>] > system_call_fastpath+0x16/0x1b > Jun 19 11:16:57 spcnode2 kernel: [76564.946488] Code: 41 5c 41 5d 41 > 5e 41 5f 5d c3 90 4c 89 f7 e8 68 df 46 00 eb c3 0f 0b 0f 1f 40 00 55 > 48 89 e5 41 56 41 55 41 54 53 66 66 66 66 90 <80> 7f 79 00 4c 8b 67 70 > 49 89 d6 48 89 f3 0f 95 c0 48 85 f6 0f > Jun 19 11:16:57 spcnode2 kernel: [76564.966571] RIP > [<ffffffff811ed770>] sysfs_find_dirent+0x10/0x110 > Jun 19 11:16:57 spcnode2 kernel: [76564.972826] RSP <ffff8805f8f9bc58> > Jun 19 11:16:57 spcnode2 kernel: [76564.976331] CR2: 0000000000000079 > Jun 19 11:16:57 spcnode2 kernel: [76564.979725] ---[ end trace > ace27f1cbf93eeab ]--- > > > Had to do a hard reset on the machine afterwards. 
>
> The machine mounting the RBD is running Ubuntu 12.04, and is not
> hosting any OSDs or MONs.
> root@spcnode2:~# uname -a
> Linux spcnode2 3.2.0-25-generic #40-Ubuntu SMP Wed May 23 20:30:51 UTC
> 2012 x86_64 x86_64 x86_64 GNU/Linux
> root@spcnode2:~# ceph --version
> ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372)
>
> - Travis
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
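
When comparing traces like the two quoted above (both go through
rbd_remove -> device_unregister -> rbd_dev_release -> del_gendisk), it
can help to reduce the syslog text to just the symbol names. The
following is only a rough sketch in Python, not part of any Ceph or
kernel tooling: the function name call_trace_symbols and the regular
expression are my own assumptions about the "[<address>] symbol+0xoff/0xlen"
layout seen in this log. Entries marked with "?" are ones the kernel
is not certain belong to the call chain, so they are skipped by default.

```python
import re

# One stack entry looks like "[<ffffffff812fb960>] del_gendisk+0x100/0x260".
# An optional "> " (mail quote marker left by line wrapping) and an optional
# "? " (kernel's marker for an unreliable entry) may precede the symbol.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:>\s+)?(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def call_trace_symbols(text, include_unreliable=False):
    """Return the symbol names of stack entries found in `text`, in order.

    Hypothetical helper for illustration; it matches any bracketed-address
    entry, so an "RIP: ... [<addr>] symbol+0x.." line is picked up as well.
    """
    symbols = []
    for match in FRAME_RE.finditer(text):
        unreliable = match.group(1) is not None
        if unreliable and not include_unreliable:
            continue
        symbols.append(match.group(2))
    return symbols


# A few entries copied from the trace in this thread.
sample = (
    "[<ffffffff812fb960>] del_gendisk+0x100/0x260 "
    "[<ffffffffa0623868>] rbd_dev_release+0x108/0x110 [rbd] "
    "[<ffffffff8130cf90>] ? kobject_del+0x40/0x40 "
    "[<ffffffff8130e686>] kref_put+0x36/0x70"
)

if __name__ == "__main__":
    print(call_trace_symbols(sample))
```

Feeding it the whole quoted log (for example the contents of a saved
copy of this message) should give one flat list per run, which makes it
easier to see that both the WARNING and the NULL dereference come from
the same rbd_remove path.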