Re: couldn't use rbd

Hi,

I upgraded ceph to 0.41, re-ran mkcephfs, and my issue is now fixed.
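
For reference, the upgrade steps were roughly the following (this assumes the
0.41 packages come from the ceph.com Ubuntu repository, and note that
--mkbtrfs recreates the btrfs filesystem on the 'btrfs devs' from ceph.conf,
wiping the old data):

-----
root@ceph01:~# /etc/init.d/ceph -a stop
root@ceph01:~# apt-get update && apt-get install ceph librados2 librbd1
root@ceph01:~# mkcephfs -a -c /etc/ceph/ceph.conf --mkbtrfs
root@ceph01:~# /etc/init.d/ceph -a start
-----

After that, creating and listing an rbd image works: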

-----
root@ceph01:~# rbd list
pool rbd doesn't contain rbd images
root@ceph01:~# rbd create test --size 1024
root@ceph01:~# rbd list
test
-----

Josh, thank you for your advice.
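
The next step is to attach the image to a KVM guest as I originally planned.
A rough sketch of what I intend to try (this assumes qemu/kvm is built with
rbd support; with cephx enabled, qemu/librbd also has to be able to read
ceph.conf and the admin keyring):

-----
root@ceph01:~# kvm -m 512 -drive file=rbd:rbd/test,if=virtio
-----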


2012/2/3 Josh Durgin <josh.durgin@xxxxxxxxxxxxx>:
> On 02/03/2012 02:54 PM, Masuko Tomoya wrote:
>>
>> Hi,
>>
>> The output of 'ceph pg dump' is below.
>>
>> root@ceph01:~# ceph pg dump
>> 2012-02-04 07:50:15.453151 mon <- [pg,dump]
>> 2012-02-04 07:50:15.453734 mon.0 ->  'dumped all in format plain' (0)
>> version 63
>> last_osdmap_epoch 37
>> last_pg_scan 1
>> full_ratio 0.95
>> nearfull_ratio 0.85
>> pg_stat objects mip     degr    unf     kb      bytes   log
>> disklog state   v       reported        up      acting  last_scrub
>> 1.1p0   0       0       0       0       0       0       0       0
>>  active+clean+degraded   0'0     34'30   [0]     [0] 0'0
>> 2012-02-03 10:25:55.383343
>> 0.0p0   0       0       0       0       0       0       0       0
>>  active+clean+degraded   0'0     34'32   [0]     [0] 0'0
>> 2012-02-03 10:25:51.380648
>> 1.0p0   0       0       0       0       0       0       0       0
>>  active+clean+degraded   0'0     34'30   [0]     [0] 0'0
>> 2012-02-03 10:25:53.381291
>> 0.1p0   0       0       0       0       0       0       0       0
>>  active+clean+degraded   0'0     34'32   [0]     [0] 0'0
>> 2012-02-03 10:25:52.380881
>> 2.0p0   0       0       0       0       0       0       0       0
>>  active+clean+degraded   0'0     34'30   [0]     [0] 0'0
>> 2012-02-03 10:25:59.387441
>> 2.1p0   0       0       0       0       0       0       0       0
>>  active+clean+degraded   0'0     34'30   [0]     [0] 0'0
>> 2012-02-03 10:26:04.392778
>> pool 0  0       0       0       0       0       0       0       0
>> pool 1  0       0       0       0       0       0       0       0
>> pool 2  0       0       0       0       0       0       0       0
>>  sum    0       0       0       0       0       0       0       0
>> osdstat kbused  kbavail kb      hb in   hb out
>> 0       1568    15726032        15727600        []      []
>>  sum    1568    15726032        15727600
>>
>
> You hit a bug in 0.38 that made the default crushmap for one osd
> contain no pgs. This was fixed in 0.39, so I'd suggest upgrading and
> re-running mkcephfs.
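>
> (If you want to see what the map looks like before wiping things, a rough
> way to inspect the current crushmap is something like:
>
>   ceph osd getcrushmap -o /tmp/crush
>   crushtool -d /tmp/crush -o /tmp/crush.txt
>
> and then check the devices and rules in /tmp/crush.txt.)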
>
>
>>
>> 2012/2/3 Josh Durgin <josh.durgin@xxxxxxxxxxxxx>:
>>>
>>> On 02/03/2012 02:14 PM, Masuko Tomoya wrote:
>>>>
>>>>
>>>> Hi Josh,
>>>>
>>>> Thank you for your comments.
>>>>
>>>>>    debug osd = 20
>>>>>    debug ms = 1
>>>>>    debug filestore = 20
>>>>
>>>>
>>>>
>>>> I added this to the osd section of ceph.conf and restarted ceph by
>>>> running /etc/init.d/ceph stop and then start.
>>>>
>>>> The output of OSD.log when 'rbd list' was executed is below.
>>>>
>>>> -----
>>>> 2012-02-04 04:29:22.457990 7fe0e08fb710 osd.0 32
>>>> OSD::ms_verify_authorizer name=client.admin auid=0
>>>> 2012-02-04 04:29:22.458041 7fe0e08fb710 osd.0 32  new session
>>>> 0x24f5240 con=0x24d4dc0 addr=10.68.119.191:0/1005110
>>>> 2012-02-04 04:29:22.458069 7fe0e08fb710 osd.0 32  session 0x24f5240
>>>> has caps osdcaps(pools={} default allow= default_deny=)
>>>> 2012-02-04 04:29:22.458415 7fe0e6c0a710 -- 10.68.119.191:6801/4992 <==
>>>> client.4201 10.68.119.191:0/1005110 1 ==== osd_op(client.4201.0:1
>>>> rbd_directory [read 0~0] 2.30a98c1c) v3 ==== 143+0+0 (3720164172 0 0)
>>>> 0x24d8900 con 0x24d4dc0
>>>> 2012-02-04 04:29:22.458442 7fe0e6c0a710 osd.0 32 _dispatch 0x24d8900
>>>> osd_op(client.4201.0:1 rbd_directory [read 0~0] 2.30a98c1c) v3
>>>> 2012-02-04 04:29:22.458463 7fe0e6c0a710 osd.0 32
>>>> require_same_or_newer_map 32 (i am 32) 0x24d8900
>>>> 2012-02-04 04:29:22.458487 7fe0e6c0a710 osd.0 32 _share_map_incoming
>>>> client.4201 10.68.119.191:0/1005110 32
>>>> 2012-02-04 04:29:22.458507 7fe0e6c0a710 osd.0 32 hit non-existent pg
>>>> 2.0, waiting
>>>
>>>
>>>
>>> The pg should have been created already. What's the output of 'ceph pg
>>> dump'?
>>>
>>>
>>>>
>>>>
>>>> 2012/2/3 Josh Durgin <josh.durgin@xxxxxxxxxxxxx>:
>>>>>
>>>>>
>>>>> On 02/03/2012 12:51 AM, Masuko Tomoya wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Josh,
>>>>>>
>>>>>> Thank you for your reply!
>>>>>>
>>>>>>> This might mean the rbd image list object can't be read for some
>>>>>>> reason, or the rbd tool is doing something weird that the rados tool
>>>>>>> isn't. Can you share the output of 'ceph -s' and
>>>>>>> 'rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20
>>>>>>> --debug-monc 20 --debug-auth 20'?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The output of 'ceph -s' is below.
>>>>>> -------
>>>>>> root@ceph01:~# ceph -s
>>>>>> 2012-02-03 17:01:47.881960    pg v33: 6 pgs: 6 active+clean+degraded;
>>>>>> 0 KB data, 1056 KB used, 15357 MB / 15358 MB avail
>>>>>> 2012-02-03 17:01:47.882583   mds e9: 1/1/1 up {0=0=up:creating}
>>>>>> 2012-02-03 17:01:47.882733   osd e21: 1 osds: 1 up, 1 in
>>>>>> 2012-02-03 17:01:47.883042   log 2012-02-03 16:35:11.183897 osd.0
>>>>>> 10.68.119.191:6801/2912 12 : [WRN] map e19 wrongly marked me down or
>>>>>> wrong addr
>>>>>> 2012-02-03 17:01:47.883144   mon e1: 1 mons at
>>>>>> {0=10.68.119.191:6789/0}
>>>>>>
>>>>>> The output of 'rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20
>>>>>> --debug-monc 20 --debug-auth 20' is below.
>>>>>> ------
>>>>>> root@ceph01:~# rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20
>>>>>> --debug-monc 20 --debug-auth 20
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> <snip>
>>>>>
>>>>>
>>>>>> 2012-02-03 17:02:10.971391 7f88cbb91720 client.4106.objecter op_submit
>>>>>> oid rbd_directory @2 [read 0~0] tid 1 osd.0
>>>>>> 2012-02-03 17:02:10.971465 7f88cbb91720 client.4106.objecter send_op 1
>>>>>> to
>>>>>> osd.0
>>>>>> 2012-02-03 17:02:10.971533 7f88cbb91720 -- 10.68.119.191:0/1003500 -->
>>>>>> 10.68.119.191:6801/2912 -- osd_op(client.4106.0:1 rbd_directory [read
>>>>>> 0~0] 2.30a98c1c) v1 -- ?+0 0x24664c0 con 0x24661b0
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Everything above here is normal - the rbd tool connected to the
>>>>> monitors, got the monmap and osdmap, and sent a request to read the
>>>>> 'rbd_directory' object.
>>>>>
>>>>> <snip>
>>>>>
>>>>>
>>>>>> 2012-02-03 17:02:25.969338 7f88c7261710 client.4106.objecter  tid 1 on
>>>>>> osd.0 is laggy
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> This means the osd isn't responding to the read. Check the osd log for
>>>>> errors. If there's nothing obvious, add this to the osd section of your
>>>>> ceph.conf and restart the osd:
>>>>>
>>>>>    debug osd = 20
>>>>>    debug ms = 1
>>>>>    debug filestore = 20
>>>>>
>>>>> Then run 'rbd ls' and look at what happens after
>>>>> 'osd_op.*rbd_directory' appears in the osd log.
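>>>>>
>>>>> For example, something along these lines (assuming the osd log is at
>>>>> the default location, /var/log/ceph/osd.0.log):
>>>>>
>>>>>   grep -A 20 'osd_op.*rbd_directory' /var/log/ceph/osd.0.log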
>>>>>
>>>>> <rados lspools log>
>>>>>
>>>>>
>>>>>>
>>>>>> I compared those logs and found some differences.
>>>>>> 'rbd list'
>>>>>> 2012-02-03 17:02:10.971770 7f88c9366710 -- 10.68.119.191:0/1003500 <==
>>>>>> mon.0 10.68.119.191:6789/0 10 ==== osd_map(21..21 src has 1..21) v2
>>>>>> ==== 1284+0+0 (473305567 0 0) 0x24655a0 con 0x24603e0
>>>>>> 2012-02-03 17:02:10.971789 7f88c9366710 client.4106.objecter
>>>>>> handle_osd_map ignoring epochs [21,21] <= 21
>>>>>> 2012-02-03 17:02:10.971801 7f88c9366710 client.4106.objecter
>>>>>> dump_active .. 0 homeless
>>>>>> 2012-02-03 17:02:10.971815 7f88c9366710 client.4106.objecter 1
>>>>>> 2.30a98c1c      osd.0   rbd_directory   [read 0~0]
>>>>>> --(snip)--
>>>>>>
>>>>>> 'rados lspools'
>>>>>> 2012-02-03 17:11:52.866072 7f9c5764b710 -- 10.68.119.191:0/1003868 <==
>>>>>> mon.0 10.68.119.191:6789/0 7 ==== osd_map(21..21 src has 1..21) v2
>>>>>> ==== 1284+0+0 (473305567 0 0) 0x770a70 con 0x771440
>>>>>> 2012-02-03 17:11:52.866103 7f9c5764b710 client.4107.objecter
>>>>>> handle_osd_map got epochs [21,21] > 0
>>>>>> 2012-02-03 17:11:52.866111 7f9c5764b710 client.4107.objecter
>>>>>> handle_osd_map decoding full epoch 21
>>>>>> 2012-02-03 17:11:52.866272 7f9c5764b710 client.4107.objecter
>>>>>> dump_active .. 0 homeless
>>>>>> data
>>>>>> metadata
>>>>>> rbd
>>>>>> --(snip)--
>>>>>>
>>>>>> What do these logs mean?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The difference is that 'rbd ls' talks to the monitors and osds, while
>>>>> 'rados lspools' just needs to talk to the monitors. The objecter
>>>>> dump_active part is listing in-flight osd requests.
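>>>>>
>>>>> As a cross-check, you can exercise the same osd read path without the
>>>>> rbd tool, for instance with something like:
>>>>>
>>>>>   rados -p rbd get rbd_directory /tmp/rbd_directory
>>>>>
>>>>> If that hangs as well, the problem is on the osd side rather than in
>>>>> the rbd tool itself.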
>>>>>
>>>>>
>>>>>>
>>>>>>>> *ceph cluster is configured on a single server.
>>>>>>>> *server is ubuntu 10.10 maverick.
>>>>>>>> *ceph, librados2 and librbd1 packages are installed.
>>>>>>>>  (version: 0.38-1maverick)
>>>>>>>> *apparmor is disabled.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> apparmor shouldn't matter if you have libvirt 0.9.9 or newer.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I use libvirt 0.8.3 (latest version for maverick), so I disabled
>>>>>> apparmor.
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2012/2/2 Josh Durgin <josh.durgin@xxxxxxxxxxxxx>:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 02/02/2012 01:49 AM, Masuko Tomoya wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi, all.
>>>>>>>>
>>>>>>>> When I execute the "rbd" command, it does not succeed.
>>>>>>>>
>>>>>>>> root@ceph01:~# rbd list
>>>>>>>> (no response)
>>>>>>>>
>>>>>>>> /var/log/ceph/mon.0.log
>>>>>>>> -----
>>>>>>>> 2012-02-02 17:58:19.801762 7ff4bbfb1710 -- 10.68.119.191:6789/0 <==
>>>>>>>> client.? 10.68.119.191:0/1002580 1 ==== auth(proto 0 30 bytes) v1
>>>>>>>> ====
>>>>>>>> 56+0+0 (625540289 0 0) 0x1619a00 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.801919 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> 10.68.119.191:0/1002580 -- auth_reply(proto 2 0 Success) v1 -- ?+0
>>>>>>>> 0x1619c00 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.802505 7ff4bbfb1710 -- 10.68.119.191:6789/0 <==
>>>>>>>> client.? 10.68.119.191:0/1002580 2 ==== auth(proto 2 32 bytes) v1
>>>>>>>> ====
>>>>>>>> 58+0+0 (346146289 0 0) 0x161fc00 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.802673 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> 10.68.119.191:0/1002580 -- auth_reply(proto 2 0 Success) v1 -- ?+0
>>>>>>>> 0x1619a00 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.803473 7ff4bbfb1710 -- 10.68.119.191:6789/0 <==
>>>>>>>> client.? 10.68.119.191:0/1002580 3 ==== auth(proto 2 165 bytes) v1
>>>>>>>> ==== 191+0+0 (3737796417 0 0) 0x1619600 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.803745 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> 10.68.119.191:0/1002580 -- auth_reply(proto 2 0 Success) v1 -- ?+0
>>>>>>>> 0x161fc00 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.804425 7ff4bbfb1710 -- 10.68.119.191:6789/0 <==
>>>>>>>> client.? 10.68.119.191:0/1002580 4 ==== mon_subscribe({monmap=0+})
>>>>>>>> v2
>>>>>>>> ==== 23+0+0 (1620593354 0 0) 0x1617380 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.804488 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> 10.68.119.191:0/1002580 -- mon_map v1 -- ?+0 0x1635700 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.804517 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> client.? 10.68.119.191:0/1002580 -- mon_subscribe_ack(300s) v1 --
>>>>>>>> ?+0
>>>>>>>> 0x163d780
>>>>>>>> 2012-02-02 17:58:19.804550 7ff4bbfb1710 -- 10.68.119.191:6789/0 <==
>>>>>>>> client.4112 10.68.119.191:0/1002580 5 ====
>>>>>>>> mon_subscribe({monmap=0+,osdmap=0}) v2 ==== 42+0+0 (982583713 0 0)
>>>>>>>> 0x1617a80 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.804578 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> 10.68.119.191:0/1002580 -- mon_map v1 -- ?+0 0x1617380 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.804656 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> client.? 10.68.119.191:0/1002580 -- osd_map(3..3 src has 1..3) v1 --
>>>>>>>> ?+0 0x1619600
>>>>>>>> 2012-02-02 17:58:19.804744 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> client.4112 10.68.119.191:0/1002580 -- mon_subscribe_ack(300s) v1 --
>>>>>>>> ?+0 0x163d900
>>>>>>>> 2012-02-02 17:58:19.804778 7ff4bbfb1710 -- 10.68.119.191:6789/0 <==
>>>>>>>> client.4112 10.68.119.191:0/1002580 6 ====
>>>>>>>> mon_subscribe({monmap=0+,osdmap=0}) v2 ==== 42+0+0 (982583713 0 0)
>>>>>>>> 0x16178c0 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.804811 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> 10.68.119.191:0/1002580 -- mon_map v1 -- ?+0 0x1617a80 con 0x1615a00
>>>>>>>> 2012-02-02 17:58:19.804855 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> client.? 10.68.119.191:0/1002580 -- osd_map(3..3 src has 1..3) v1 --
>>>>>>>> ?+0 0x1619400
>>>>>>>> 2012-02-02 17:58:19.804884 7ff4bbfb1710 -- 10.68.119.191:6789/0 -->
>>>>>>>> client.4112 10.68.119.191:0/1002580 -- mon_subscribe_ack(300s) v1 --
>>>>>>>> ?+0 0x161d300
>>>>>>>> -----
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> No problems there.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> BTW, I could execute "rados lspools".
>>>>>>>>
>>>>>>>> root@ceph01:~# rados lspools
>>>>>>>> data
>>>>>>>> metadata
>>>>>>>> rbd
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This might mean the rbd image list object can't be read for some
>>>>>>> reason, or the rbd tool is doing something weird that the rados tool
>>>>>>> isn't. Can you share the output of 'ceph -s' and
>>>>>>> 'rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20
>>>>>>> --debug-monc 20 --debug-auth 20'?
>>>>>>>
>>>>>>> You can run 'rados lspools' with those options as well and compare.
>>>>>>>
>>>>>>>
>>>>>>>> I would like to use an rbd volume and attach it as a virtual device
>>>>>>>> to a VM guest on KVM.
>>>>>>>>
>>>>>>>> Could you advise me?
>>>>>>>>
>>>>>>>>
>>>>>>>> My environment is below.
>>>>>>>>
>>>>>>>> *ceph cluster is configured on a single server.
>>>>>>>> *server is ubuntu 10.10 maverick.
>>>>>>>> *ceph, librados2 and librbd1 packages are installed.
>>>>>>>>  (version: 0.38-1maverick)
>>>>>>>> *apparmor is disabled.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> apparmor shouldn't matter if you have libvirt 0.9.9 or newer.
>>>>>>>
>>>>>>>
>>>>>>>> *root@ceph01:/# ls -l /etc/ceph
>>>>>>>> total 16
>>>>>>>> -rw-r--r-- 1 root root 340 2012-02-02 17:28 ceph.conf
>>>>>>>> -rw------- 1 root root  92 2012-02-02 17:28 client.admin.keyring
>>>>>>>> -rw------- 1 root root  85 2012-02-02 17:28 mds.0.keyring
>>>>>>>> -rw------- 1 root root  85 2012-02-02 17:28 osd.0.keyring
>>>>>>>> */var/lib/ceph/tmp exists.
>>>>>>>> root@ceph01:/var/log# ls -l /var/lib/ceph/
>>>>>>>> total 4
>>>>>>>> drwxrwxrwx 2 root root 4096 2011-11-11 09:28 tmp
>>>>>>>>
>>>>>>>> */etc/ceph/ceph.conf
>>>>>>>> [global]
>>>>>>>>         auth supported = cephx
>>>>>>>>         keyring = /etc/ceph/$name.keyring
>>>>>>>> [mon]
>>>>>>>>         mon data = /data/data/mon$id
>>>>>>>>         debug ms = 1
>>>>>>>> [mon.0]
>>>>>>>>         host = ceph01
>>>>>>>>         mon addr = 10.68.119.191:6789
>>>>>>>> [mds]
>>>>>>>>
>>>>>>>> [mds.0]
>>>>>>>>         host = ceph01
>>>>>>>> [osd]
>>>>>>>>         osd data = /data/osd$id
>>>>>>>>         osd journal = /data/osd$id/journal
>>>>>>>>         osd journal size = 512
>>>>>>>>         osd class tmp = /var/lib/ceph/tmp
>>>>>>>> [osd.0]
>>>>>>>>         host = ceph01
>>>>>>>>         btrfs devs = /dev/sdb1
>>>>>>>>
>>>>>>>>
>>>>>>>> Waiting for your reply,
>>>>>>>>
>>>>>>>> Tomoya.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

