Awesome. Thanks Alex. I'll eagerly await 0.48 once it has finished QA.

 - Travis

On Tue, Jun 19, 2012 at 2:45 PM, Alex Elder <elder@xxxxxxxxxxxxx> wrote:
> On 06/19/2012 01:32 PM, Travis Rhoden wrote:
>> Hey folks,
>>
>> Ran into this today. Not sure what I did wrong. =)
>
> It appears you are running Linux 3.2.0. This has symptoms that
> could be explained by a bug that has been fixed in newer Ceph
> code. Specifically, I think this is the fix that, without it,
> you might see something like this:
>
>     rbd: don't drop the rbd_id too early
>
> https://github.com/ceph/ceph-client/commit/32eec68d2f233e8a6ae1cd326022f6862e2b9ce3
>
>
>                                         -Alex
>
>> I had an RBD successfully mounted and was done with it. Proceeded to
>> do the following:
>>
>> root@spcnode2:~# ls /sys/bus/rbd/devices/
>> 0
>> root@spcnode2:~# echo 0 > /sys/bus/rbd/remove
>> root@spcnode2:~# ls /sys/bus/rbd/devices/ <--- At this point, I
>> believe the RBD has been successfully removed
>>
>> ---- About an hour passes where I am messing with my ceph cluster.
>> No other commands are run on this machine ----
>> ---- New cluster is up. Time to mount my new RBD
>>
>> root@spcnode2:~# echo "10.55.30.0,10.55.30.1,10.55.30.2
>> name=admin,secret=AQCNv+BPoPQENBAAxlm39kJ5XteNxg2S/dulXw== rbd
>> perftest" | tee /sys/bus/rbd/add
>> 10.55.30.0,10.55.30.1,10.55.30.2
>> name=admin,secret=AQCNv+BPoPQENBAAxlm39kJ5XteNxg2S/dulXw== rbd
>> perftest
>> Segmentation fault
>>
>> Well that's ugly. What's in syslog?
>>
>> Jun 19 11:16:56 spcnode2 kernel: [76564.387890] ------------[ cut here
>> ]------------
>> Jun 19 11:16:56 spcnode2 kernel: [76564.392569] WARNING: at
>> /build/buildd/linux-3.2.0/fs/sysfs/inode.c:324
>> sysfs_hash_and_remove+0xa9/0xb0()
>> Jun 19 11:16:56 spcnode2 kernel: [76564.402233] Hardware name: Relion 1702
>> Jun 19 11:16:56 spcnode2 kernel: [76564.406079] sysfs: can not remove
>> 'bdi', no directory
>> Jun 19 11:16:56 spcnode2 kernel: [76564.411268] Modules linked in: rbd
>> libceph ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
>> xt_state ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp xt_conntrack
>> iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4
>> ipmi_devintf ipmi_si iptable_filter ipmi_msghandler ip_tables x_tables
>> kvm_intel kvm bnep rfcomm bluetooth parport_pc ppdev nfsd nfs lockd
>> fscache auth_rpcgss nfs_acl sunrpc ext2 xfs vesafb ib_iser rdma_cm
>> ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp
>> libiscsi scsi_transport_iscsi bridge mtdchar i7core_edac psmouse 8021q
>> garp stp lp parport dm_multipath mac_hid serio_raw edac_core ioatdma
>> usbhid hid sfc mtd i2c_algo_bit igb mdio dca btrfs zlib_deflate
>> libcrc32c
>> Jun 19 11:16:56 spcnode2 kernel: [76564.477972] Pid: 6924, comm: bash
>> Tainted: G D W 3.2.0-25-generic #40-Ubuntu
>> Jun 19 11:16:56 spcnode2 kernel: [76564.485837] Call Trace:
>> Jun 19 11:16:56 spcnode2 kernel: [76564.488394] [<ffffffff810672af>]
>> warn_slowpath_common+0x7f/0xc0
>> Jun 19 11:16:56 spcnode2 kernel: [76564.494511] [<ffffffff810673a6>]
>> warn_slowpath_fmt+0x46/0x50
>> Jun 19 11:16:56 spcnode2 kernel: [76564.500348] [<ffffffff81192958>]
>> ? iput_final+0xe8/0x210
>> Jun 19 11:16:56 spcnode2 kernel: [76564.505888] [<ffffffff811ebc59>]
>> sysfs_hash_and_remove+0xa9/0xb0
>> Jun 19 11:16:56 spcnode2 kernel: [76564.512082] [<ffffffff811ee356>]
>> sysfs_remove_link+0x26/0x30
>> Jun 19 11:16:56 spcnode2 kernel: [76564.517959] [<ffffffff812fb960>]
>> del_gendisk+0x100/0x260
>> Jun 19 11:16:56 spcnode2 kernel: [76564.523448] [<ffffffffa0623868>]
>> rbd_dev_release+0x108/0x110 [rbd]
>> Jun 19 11:16:56 spcnode2 kernel: [76564.529861] [<ffffffff813f1407>]
>> device_release+0x27/0xa0
>> Jun 19 11:16:56 spcnode2 kernel: [76564.535432] [<ffffffff8130cfdc>]
>> kobject_release+0x4c/0xa0
>> Jun 19 11:16:56 spcnode2 kernel: [76564.541163] [<ffffffff8130cf90>]
>> ? kobject_del+0x40/0x40
>> Jun 19 11:16:56 spcnode2 kernel: [76564.546694] [<ffffffff8130e686>]
>> kref_put+0x36/0x70
>> Jun 19 11:16:56 spcnode2 kernel: [76564.551764] [<ffffffff8130ce97>]
>> kobject_put+0x27/0x60
>> Jun 19 11:16:56 spcnode2 kernel: [76564.557126] [<ffffffff8131d33c>]
>> ? _kstrtoull+0x2c/0x90
>> Jun 19 11:16:56 spcnode2 kernel: [76564.562523] [<ffffffff813f1167>]
>> put_device+0x17/0x20
>> Jun 19 11:16:56 spcnode2 kernel: [76564.567808] [<ffffffff813f225e>]
>> device_unregister+0x1e/0x30
>> Jun 19 11:16:56 spcnode2 kernel: [76564.573647] [<ffffffffa06211ea>]
>> rbd_remove+0x15a/0x160 [rbd]
>> Jun 19 11:16:56 spcnode2 kernel: [76564.579594] [<ffffffff813f3c47>]
>> bus_attr_store+0x27/0x30
>> Jun 19 11:16:56 spcnode2 kernel: [76564.585113] [<ffffffff811ebebf>]
>> sysfs_write_file+0xef/0x170
>> Jun 19 11:16:56 spcnode2 kernel: [76564.590907] [<ffffffff81177f23>]
>> vfs_write+0xb3/0x180
>> Jun 19 11:16:56 spcnode2 kernel: [76564.596158] [<ffffffff8117824a>]
>> sys_write+0x4a/0x90
>> Jun 19 11:16:56 spcnode2 kernel: [76564.601258] [<ffffffff81665c42>]
>> system_call_fastpath+0x16/0x1b
>> Jun 19 11:16:56 spcnode2 kernel: [76564.607321] ---[ end trace
>> ace27f1cbf93eeaa ]---
>> Jun 19 11:16:57 spcnode2 kernel: [76564.612447] BUG: unable to handle
>> kernel NULL pointer dereference at 0000000000000079
>> Jun 19 11:16:57 spcnode2 kernel: [76564.620374] IP:
>> [<ffffffff811ed770>] sysfs_find_dirent+0x10/0x110
>> Jun 19 11:16:57 spcnode2 kernel: [76564.626475] PGD 404514067 PUD
>> 5f89cc067 PMD 0
>> Jun 19 11:16:57 spcnode2 kernel: [76564.630958] Oops: 0000 [#2] SMP
>> Jun 19 11:16:57 spcnode2 kernel: [76564.634254] CPU 5
>> Jun 19 11:16:57 spcnode2 kernel: [76564.636113] Modules linked in: rbd
>> libceph ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
>> xt_state ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp xt_conntrack
>> iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4
>> ipmi_devintf ipmi_si iptable_filter ipmi_msghandler ip_tables x_tables
>> kvm_intel kvm bnep rfcomm bluetooth parport_pc ppdev nfsd nfs lockd
>> fscache auth_rpcgss nfs_acl sunrpc ext2 xfs vesafb ib_iser rdma_cm
>> ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp
>> libiscsi scsi_transport_iscsi bridge mtdchar i7core_edac psmouse 8021q
>> garp stp lp parport dm_multipath mac_hid serio_raw edac_core ioatdma
>> usbhid hid sfc mtd i2c_algo_bit igb mdio dca btrfs zlib_deflate
>> libcrc32c
>> Jun 19 11:16:57 spcnode2 kernel: [76564.701251]
>> Jun 19 11:16:57 spcnode2 kernel: [76564.702740] Pid: 6924, comm: bash
>> Tainted: G D W 3.2.0-25-generic #40-Ubuntu Penguin Computing
>> Relion 1702/X8DTT
>> Jun 19 11:16:57 spcnode2 kernel: [76564.713752] RIP:
>> 0010:[<ffffffff811ed770>] [<ffffffff811ed770>]
>> sysfs_find_dirent+0x10/0x110
>> Jun 19 11:16:57 spcnode2 kernel: [76564.722319] RSP:
>> 0018:ffff8805f8f9bc58 EFLAGS: 00010246
>> Jun 19 11:16:57 spcnode2 kernel: [76564.727719] RAX: ffff8806186edbc0
>> RBX: 0000000000000000 RCX: 00000000000988e6
>> Jun 19 11:16:57 spcnode2 kernel: [76564.734892] RDX: ffffffff81a0158d
>> RSI: 0000000000000000 RDI: 0000000000000000
>> Jun 19 11:16:57 spcnode2 kernel: [76564.742083] RBP: ffff8805f8f9bc78
>> R08: ffffea00303f6580 R09: ffffffff8130cfe9
>> Jun 19 11:16:57 spcnode2 kernel: [76564.749221] R10: ffff880c0fe5de28
>> R11: 0000000000000000 R12: 0000000000000000
>> Jun 19 11:16:57 spcnode2 kernel: [76564.756437] R13: ffffffff81a0158d
>> R14: ffff880bf45a5a50 R15: ffff880c0fd1de18
>> Jun 19 11:16:57 spcnode2 kernel: [76564.763630] FS:
>> 00007fe308eb7700(0000) GS:ffff880c3fc20000(0000)
>> knlGS:0000000000000000
>> Jun 19 11:16:57 spcnode2 kernel: [76564.771717] CS: 0010 DS: 0000 ES:
>> 0000 CR0: 0000000080050033
>> Jun 19 11:16:57 spcnode2 kernel: [76564.777549] CR2: 0000000000000079
>> CR3: 00000005f89cd000 CR4: 00000000000006e0
>> Jun 19 11:16:57 spcnode2 kernel: [76564.784738] DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> Jun 19 11:16:57 spcnode2 kernel: [76564.791877] DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Jun 19 11:16:57 spcnode2 kernel: [76564.798991] Process bash (pid:
>> 6924, threadinfo ffff8805f8f9a000, task ffff8806186edbc0)
>> Jun 19 11:16:57 spcnode2 kernel: [76564.807295] Stack:
>> Jun 19 11:16:57 spcnode2 kernel: [76564.809302] 0000000000000000
>> 0000000000000000 ffffffff81a0158d ffff880bf45a5a50
>> Jun 19 11:16:57 spcnode2 kernel: [76564.816832] ffff8805f8f9bca8
>> ffffffff811ed9bc ffff8805f8f9bcd8 ffffffff81c34b00
>> Jun 19 11:16:57 spcnode2 kernel: [76564.824341] ffff880605b36878
>> 0000000000000000 ffff8805f8f9bce8 ffffffff811efa15
>> Jun 19 11:16:57 spcnode2 kernel: [76564.831894] Call Trace:
>> Jun 19 11:16:57 spcnode2 kernel: [76564.834337] [<ffffffff811ed9bc>]
>> sysfs_get_dirent+0x3c/0x80
>> Jun 19 11:16:57 spcnode2 kernel: [76564.840041] [<ffffffff811efa15>]
>> sysfs_remove_group+0x35/0x100
>> Jun 19 11:16:57 spcnode2 kernel: [76564.846029] [<ffffffff810fee24>]
>> blk_trace_remove_sysfs+0x14/0x20
>> Jun 19 11:16:57 spcnode2 kernel: [76564.852195] [<ffffffff812f50d9>]
>> blk_unregister_queue+0x59/0x80
>> Jun 19 11:16:57 spcnode2 kernel: [76564.858270] [<ffffffff812fb97b>]
>> del_gendisk+0x11b/0x260
>> Jun 19 11:16:57 spcnode2 kernel: [76564.863661] [<ffffffffa0623868>]
>> rbd_dev_release+0x108/0x110 [rbd]
>> Jun 19 11:16:57 spcnode2 kernel: [76564.869962] [<ffffffff813f1407>]
>> device_release+0x27/0xa0
>> Jun 19 11:16:57 spcnode2 kernel: [76564.875448] [<ffffffff8130cfdc>]
>> kobject_release+0x4c/0xa0
>> Jun 19 11:16:57 spcnode2 kernel: [76564.881061] [<ffffffff8130cf90>]
>> ? kobject_del+0x40/0x40
>> Jun 19 11:16:57 spcnode2 kernel: [76564.886502] [<ffffffff8130e686>]
>> kref_put+0x36/0x70
>> Jun 19 11:16:57 spcnode2 kernel: [76564.891521] [<ffffffff8130ce97>]
>> kobject_put+0x27/0x60
>> Jun 19 11:16:57 spcnode2 kernel: [76564.896739] [<ffffffff8131d33c>]
>> ? _kstrtoull+0x2c/0x90
>> Jun 19 11:16:57 spcnode2 kernel: [76564.902043] [<ffffffff813f1167>]
>> put_device+0x17/0x20
>> Jun 19 11:16:57 spcnode2 kernel: [76564.907226] [<ffffffff813f225e>]
>> device_unregister+0x1e/0x30
>> Jun 19 11:16:57 spcnode2 kernel: [76564.913057] [<ffffffffa06211ea>]
>> rbd_remove+0x15a/0x160 [rbd]
>> Jun 19 11:16:57 spcnode2 kernel: [76564.918881] [<ffffffff813f3c47>]
>> bus_attr_store+0x27/0x30
>> Jun 19 11:16:57 spcnode2 kernel: [76564.924436] [<ffffffff811ebebf>]
>> sysfs_write_file+0xef/0x170
>> Jun 19 11:16:57 spcnode2 kernel: [76564.930174] [<ffffffff81177f23>]
>> vfs_write+0xb3/0x180
>> Jun 19 11:16:57 spcnode2 kernel: [76564.935450] [<ffffffff8117824a>]
>> sys_write+0x4a/0x90
>> Jun 19 11:16:57 spcnode2 kernel: [76564.940497] [<ffffffff81665c42>]
>> system_call_fastpath+0x16/0x1b
>> Jun 19 11:16:57 spcnode2 kernel: [76564.946488] Code: 41 5c 41 5d 41
>> 5e 41 5f 5d c3 90 4c 89 f7 e8 68 df 46 00 eb c3 0f 0b 0f 1f 40 00 55
>> 48 89 e5 41 56 41 55 41 54 53 66 66 66 66 90 <80> 7f 79 00 4c 8b 67 70
>> 49 89 d6 48 89 f3 0f 95 c0 48 85 f6 0f
>> Jun 19 11:16:57 spcnode2 kernel: [76564.966571] RIP
>> [<ffffffff811ed770>] sysfs_find_dirent+0x10/0x110
>> Jun 19 11:16:57 spcnode2 kernel: [76564.972826] RSP <ffff8805f8f9bc58>
>> Jun 19 11:16:57 spcnode2 kernel: [76564.976331] CR2: 0000000000000079
>> Jun 19 11:16:57 spcnode2 kernel: [76564.979725] ---[ end trace
>> ace27f1cbf93eeab ]---
>>
>>
>> Had to do a hard reset on the machine afterwards.
>>
>> The machine mounting the RBD is running Ubuntu 12.04, and is not
>> hosting any OSDs or MONs.
>> root@spcnode2:~# uname -a
>> Linux spcnode2 3.2.0-25-generic #40-Ubuntu SMP Wed May 23 20:30:51 UTC
>> 2012 x86_64 x86_64 x86_64 GNU/Linux
>> root@spcnode2:~# ceph --version
>> ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372)
>>
>> - Travis
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
>