Re: couldn't use rbd

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02/03/2012 12:51 AM, Masuko Tomoya wrote:
Hi Josh,

Thank you for reply !

This might mean the rbd image list object can't be read for some
reason, or the rbd tool is doing something weird that the rados tool
isn't. Can you share the output of 'ceph -s' and
'rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20 --debug-monc 20
--debug-auth 20'?

The output of 'ceph -s' is below.
-------
root@ceph01:~# ceph -s
2012-02-03 17:01:47.881960    pg v33: 6 pgs: 6 active+clean+degraded;
0 KB data, 1056 KB used, 15357 MB / 15358 MB avail
2012-02-03 17:01:47.882583   mds e9: 1/1/1 up {0=0=up:creating}
2012-02-03 17:01:47.882733   osd e21: 1 osds: 1 up, 1 in
2012-02-03 17:01:47.883042   log 2012-02-03 16:35:11.183897 osd.0
10.68.119.191:6801/2912 12 : [WRN] map e19 wrongly marked me down or
wrong addr
2012-02-03 17:01:47.883144   mon e1: 1 mons at {0=10.68.119.191:6789/0}

The output of 'rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20
--debug-monc 20 --debug-auth 20' is below.
------
root@ceph01:~# rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20
--debug-monc 20 --debug-auth 20

<snip>

2012-02-03 17:02:10.971391 7f88cbb91720 client.4106.objecter op_submit
oid rbd_directory @2 [read 0~0] tid 1 osd.0
2012-02-03 17:02:10.971465 7f88cbb91720 client.4106.objecter send_op 1 to osd.0
2012-02-03 17:02:10.971533 7f88cbb91720 -- 10.68.119.191:0/1003500 -->
10.68.119.191:6801/2912 -- osd_op(client.4106.0:1 rbd_directory [read
0~0] 2.30a98c1c) v1 -- ?+0 0x24664c0 con 0x24661b0

Everything above here is normal - the rbd tool connected to the
monitors, got the monmap and osdmap, and sent a request to read the
'rbd_directory' object.

<snip>

2012-02-03 17:02:25.969338 7f88c7261710 client.4106.objecter  tid 1 on
osd.0 is laggy

This means the osd isn't responding to the read. Check the osd log for
errors. If there's nothing obvious, add this to the osd section of your
ceph.conf and restart the osd:

    debug osd = 20
    debug ms = 1
    debug filestore = 20

Then run 'rbd ls' and look at what happens after
'osd_op.*rbd_directory' appears in the osd log.

<rados lspools log>


I compared those logs and found there is differences.
'rbd list'
2012-02-03 17:02:10.971770 7f88c9366710 -- 10.68.119.191:0/1003500<==
mon.0 10.68.119.191:6789/0 10 ==== osd_map(21..21 src has 1..21) v2
==== 1284+0+0 (473305567 0 0) 0x24655a0 con 0x24603e0
2012-02-03 17:02:10.971789 7f88c9366710 client.4106.objecter
handle_osd_map ignoring epochs [21,21]<= 21
2012-02-03 17:02:10.971801 7f88c9366710 client.4106.objecter
dump_active .. 0 homeless
2012-02-03 17:02:10.971815 7f88c9366710 client.4106.objecter 1
2.30a98c1c      osd.0   rbd_directory   [read 0~0]
--(snip)--

'rados lspools'
2012-02-03 17:11:52.866072 7f9c5764b710 -- 10.68.119.191:0/1003868<==
mon.0 10.68.119.191:6789/0 7 ==== osd_map(21..21 src has 1..21) v2
==== 1284+0+0 (473305567 0 0) 0x770a70 con 0x771440
2012-02-03 17:11:52.866103 7f9c5764b710 client.4107.objecter
handle_osd_map got epochs [21,21]>  0
2012-02-03 17:11:52.866111 7f9c5764b710 client.4107.objecter
handle_osd_map decoding full epoch 21
2012-02-03 17:11:52.866272 7f9c5764b710 client.4107.objecter
dump_active .. 0 homeless
data
metadata
rbd
--(snip)--

What do these logs mean ?

The difference is that 'rbd ls' talks to the monitors and osds, while 'rados lspools' just needs to talk to the monitors. The objecter dump_active part is listing in-flight osd requests.


*ceph cluster is configured on single server.
*server is ubuntu 10.10 maverick.
*ceph, librados2 and librbd1 packages are installed.
  (version: 0.38-1maverick)
*apparmor is disable.


apparmor shouldn't matter if you have libvirt 0.9.9 or newer.

I use libvirt 0.8.3 (latest version for maverick), so I disabled apparmor.



2012/2/2 Josh Durgin<josh.durgin@xxxxxxxxxxxxx>:
On 02/02/2012 01:49 AM, Masuko Tomoya wrote:

Hi, all.

When I execute "rbd" command, it is not success.

root@ceph01:~# rbd list
(no response)

/var/log/ceph/mon.0.log
-----
2012-02-02 17:58:19.801762 7ff4bbfb1710 -- 10.68.119.191:6789/0<==
client.? 10.68.119.191:0/1002580 1 ==== auth(proto 0 30 bytes) v1 ====
56+0+0 (625540289 0 0) 0x1619a00 con 0x1615a00
2012-02-02 17:58:19.801919 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
10.68.119.191:0/1002580 -- auth_reply(proto 2 0 Success) v1 -- ?+0
0x1619c00 con 0x1615a00
2012-02-02 17:58:19.802505 7ff4bbfb1710 -- 10.68.119.191:6789/0<==
client.? 10.68.119.191:0/1002580 2 ==== auth(proto 2 32 bytes) v1 ====
58+0+0 (346146289 0 0) 0x161fc00 con 0x1615a00
2012-02-02 17:58:19.802673 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
10.68.119.191:0/1002580 -- auth_reply(proto 2 0 Success) v1 -- ?+0
0x1619a00 con 0x1615a00
2012-02-02 17:58:19.803473 7ff4bbfb1710 -- 10.68.119.191:6789/0<==
client.? 10.68.119.191:0/1002580 3 ==== auth(proto 2 165 bytes) v1
==== 191+0+0 (3737796417 0 0) 0x1619600 con 0x1615a00
2012-02-02 17:58:19.803745 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
10.68.119.191:0/1002580 -- auth_reply(proto 2 0 Success) v1 -- ?+0
0x161fc00 con 0x1615a00
2012-02-02 17:58:19.804425 7ff4bbfb1710 -- 10.68.119.191:6789/0<==
client.? 10.68.119.191:0/1002580 4 ==== mon_subscribe({monmap=0+}) v2
==== 23+0+0 (1620593354 0 0) 0x1617380 con 0x1615a00
2012-02-02 17:58:19.804488 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
10.68.119.191:0/1002580 -- mon_map v1 -- ?+0 0x1635700 con 0x1615a00
2012-02-02 17:58:19.804517 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
client.? 10.68.119.191:0/1002580 -- mon_subscribe_ack(300s) v1 -- ?+0
0x163d780
2012-02-02 17:58:19.804550 7ff4bbfb1710 -- 10.68.119.191:6789/0<==
client.4112 10.68.119.191:0/1002580 5 ====
mon_subscribe({monmap=0+,osdmap=0}) v2 ==== 42+0+0 (982583713 0 0)
0x1617a80 con 0x1615a00
2012-02-02 17:58:19.804578 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
10.68.119.191:0/1002580 -- mon_map v1 -- ?+0 0x1617380 con 0x1615a00
2012-02-02 17:58:19.804656 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
client.? 10.68.119.191:0/1002580 -- osd_map(3..3 src has 1..3) v1 --
?+0 0x1619600
2012-02-02 17:58:19.804744 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
client.4112 10.68.119.191:0/1002580 -- mon_subscribe_ack(300s) v1 --
?+0 0x163d900
2012-02-02 17:58:19.804778 7ff4bbfb1710 -- 10.68.119.191:6789/0<==
client.4112 10.68.119.191:0/1002580 6 ====
mon_subscribe({monmap=0+,osdmap=0}) v2 ==== 42+0+0 (982583713 0 0)
0x16178c0 con 0x1615a00
2012-02-02 17:58:19.804811 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
10.68.119.191:0/1002580 -- mon_map v1 -- ?+0 0x1617a80 con 0x1615a00
2012-02-02 17:58:19.804855 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
client.? 10.68.119.191:0/1002580 -- osd_map(3..3 src has 1..3) v1 --
?+0 0x1619400
2012-02-02 17:58:19.804884 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
client.4112 10.68.119.191:0/1002580 -- mon_subscribe_ack(300s) v1 --
?+0 0x161d300
-----


No problems there.



BTW, I could execute "rados lspools".

root@ceph01:~# rados lspools
data
metadata
rbd


This might mean the rbd image list object can't be read for some
reason, or the rbd tool is doing something weird that the rados tool
isn't. Can you share the output of 'ceph -s' and
'rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20 --debug-monc 20
--debug-auth 20'?

You can run 'rados lspools' with those options as well and compare.


I would like to use rbd volume and attach it as virtual device for VM
guest on KVM.

Could you advice to me ?


My environment is below.

*ceph cluster is configured on single server.
*server is ubuntu 10.10 maverick.
*ceph, librados2 and librbd1 packages are installed.
  (version: 0.38-1maverick)
*apparmor is disable.


apparmor shouldn't matter if you have libvirt 0.9.9 or newer.


*root@ceph01:/# ls -l /etc/ceph
total 16
-rw-r--r-- 1 root root 340 2012-02-02 17:28 ceph.conf
-rw------- 1 root root  92 2012-02-02 17:28 client.admin.keyring
-rw------- 1 root root  85 2012-02-02 17:28 mds.0.keyring
-rw------- 1 root root  85 2012-02-02 17:28 osd.0.keyring
*/var/lib/ceph/tmp is exists.
root@ceph01:/var/log# ls -l /var/lib/ceph/
total 4
drwxrwxrwx 2 root root 4096 2011-11-11 09:28 tmp

*/etc/ceph/ceph.conf
[global]
         auth supported = cephx
         keyring = /etc/ceph/$name.keyring
[mon]
         mon data = /data/data/mon$id
         debug ms = 1
[mon.0]
         host = ceph01
         mon addr = 10.68.119.191:6789
[mds]

[mds.0]
         host = ceph01
[osd]
         osd data = /data/osd$id
         osd journal = /data/osd$id/journal
         osd journal size = 512
         osd class tmp = /var/lib/ceph/tmp
[osd.0]
         host = ceph01
         btrfs devs = /dev/sdb1


Waiting for your reply,

Tomoya.




--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux