Re: directory hang when mounted from a mapped rbd

Thanks for the fast reply.
Here is the output from one of the faulty hosts:

root@musicgci5:~#  ceph -s
  cluster 409059ba-797e-46da-bc2f-83e3c7779094
   health HEALTH_OK
   monmap e1: 3 mons at
{musicgci2=192.168.43.12:6789/0,musicgci3=192.168.43.13:6789/0,musicgci4=192.168.43.14:6789/0},
election epoch 70, quorum 0,1,2 musicgci2,musicgci3,musicgci4
   osdmap e32317: 69 osds: 69 up, 69 in
    pgmap v39521976: 18748 pgs: 18748 active+clean; 48326 GB data, 141
TB used, 46977 GB / 187 TB avail; 319KB/s wr, 2op/s
   mdsmap e1: 0/0/1 up

root@musicgci5:~#  find /sys/kernel/debug/ceph -type f -print -exec cat {} \;
/sys/kernel/debug/ceph/409059ba-797e-46da-bc2f-83e3c7779094.client400179/osdmap
epoch 32317
flags
pg_pool 0 pg_num 64 / 63, lpg_num 0 / 0
pg_pool 1 pg_num 64 / 63, lpg_num 0 / 0
pg_pool 2 pg_num 2048 / 2047, lpg_num 0 / 0
pg_pool 3 pg_num 1500 / 2047, lpg_num 0 / 0
pg_pool 4 pg_num 1500 / 2047, lpg_num 0 / 0
pg_pool 5 pg_num 1500 / 2047, lpg_num 0 / 0
pg_pool 6 pg_num 1500 / 2047, lpg_num 0 / 0
pg_pool 7 pg_num 1500 / 2047, lpg_num 0 / 0
pg_pool 8 pg_num 1500 / 2047, lpg_num 0 / 0
pg_pool 9 pg_num 1500 / 2047, lpg_num 0 / 0
pg_pool 10 pg_num 1500 / 2047, lpg_num 0 / 0
pg_pool 11 pg_num 1500 / 2047, lpg_num 0 / 0
pg_pool 13 pg_num 1024 / 1023, lpg_num 0 / 0
pg_pool 14 pg_num 1024 / 1023, lpg_num 0 / 0
pg_pool 15 pg_num 1024 / 1023, lpg_num 0 / 0
osd0 192.168.43.11:6808 100% (exists, up)
osd1 192.168.43.12:6808 100% (exists, up)
osd2 192.168.43.13:6828 100% (exists, up)
osd3 192.168.43.14:6804 100% (exists, up)
osd4 192.168.43.15:6805  0% (doesn't exist)
osd5 192.168.43.11:6816 100% (exists, up)
osd6 192.168.43.12:6816 100% (exists, up)
osd7 192.168.43.13:6808 100% (exists, up)
osd8 192.168.43.14:6816 100% (exists, up)
osd9 192.168.43.15:6800 100% (exists, up)
osd10 192.168.43.11:6832 100% (exists, up)
osd11 192.168.43.12:6800 100% (exists, up)
osd12 192.168.43.13:6800 100% (exists, up)
osd13 192.168.43.14:6836 100% (exists, up)
osd14 192.168.43.15:6809 100% (exists, up)
osd15 192.168.43.11:6828 100% (exists, up)
osd16 192.168.43.12:6820 100% (exists, up)
osd17 192.168.43.13:6832 100% (exists, up)
osd18 192.168.43.14:6800 100% (exists, up)
osd19 192.168.43.15:6810 100% (exists, up)
osd20 192.168.43.11:6804 100% (exists, up)
osd21 192.168.43.12:6804 100% (exists, up)
osd22 192.168.43.13:6816 100% (exists, up)
osd23 192.168.43.14:6812 100% (exists, up)
osd24 192.168.43.15:6852 100% (exists, up)
osd25 192.168.43.11:6836 100% (exists, up)
osd26 192.168.43.12:6812 100% (exists, up)
osd27 192.168.43.13:6824 100% (exists, up)
osd28 192.168.43.14:6832 100% (exists, up)
osd29 192.168.43.15:6836 100% (exists, up)
osd30 192.168.43.11:6812 100% (exists, up)
osd31 192.168.43.12:6824 100% (exists, up)
osd32 192.168.43.13:6812 83% (exists, up)
osd33 192.168.43.14:6808 100% (exists, up)
osd34 192.168.43.15:6801 89% (exists, up)
osd35 192.168.43.11:6820 100% (exists, up)
osd36 192.168.43.12:6832 100% (exists, up)
osd37 192.168.43.13:6836 79% (exists, up)
osd38 192.168.43.14:6828 100% (exists, up)
osd39 192.168.43.15:6818 86% (exists, up)
osd40 192.168.43.11:6800 83% (exists, up)
osd41 192.168.43.12:6836 100% (exists, up)
osd42 192.168.43.13:6820 100% (exists, up)
osd43 192.168.43.14:6824 100% (exists, up)
osd44 192.168.43.15:6823 100% (exists, up)
osd45 192.168.43.11:6824 100% (exists, up)
osd46 192.168.43.12:6828 100% (exists, up)
osd47 192.168.43.13:6804 100% (exists, up)
osd48 192.168.43.14:6820 100% (exists, up)
osd49 192.168.43.15:6805 100% (exists, up)
osd50 192.168.43.17:6832 86% (exists, up)
osd51 192.168.43.16:6820 100% (exists, up)
osd52 192.168.43.18:6820  0% (doesn't exist)
osd53 192.168.43.19:6808  0% (doesn't exist)
osd54 192.168.43.20:6816  0% (doesn't exist)
osd55 192.168.43.21:6836  0% (doesn't exist)
osd56 192.168.43.16:6828 100% (exists, up)
osd57 192.168.43.17:6808 100% (exists, up)
osd58 192.168.43.18:6832  0% (doesn't exist)
osd59 192.168.43.19:6800  0% (doesn't exist)
osd60 192.168.43.20:6824  0% (doesn't exist)
osd61 192.168.43.21:6816  0% (doesn't exist)
osd62 192.168.43.16:6812 100% (exists, up)
osd63 192.168.43.17:6816 100% (exists, up)
osd64 192.168.43.18:6800  0% (doesn't exist)
osd65 192.168.43.19:6828  0% (doesn't exist)
osd66 192.168.43.20:6808  0% (doesn't exist)
osd67 192.168.43.21:6832  0% (doesn't exist)
osd68 192.168.43.16:6808 100% (exists, up)
osd69 192.168.43.17:6847 100% (exists, up)
osd70 192.168.43.18:6812  0% (doesn't exist)
osd71 192.168.43.19:6824  0% (doesn't exist)
osd72 192.168.43.20:6804  0% (doesn't exist)
osd73 192.168.43.21:6800  0% (doesn't exist)
osd74 192.168.43.16:6806 100% (exists, up)
osd75 192.168.43.17:6843 100% (exists, up)
osd76 192.168.43.18:6801  0% (doesn't exist)
osd77 192.168.43.19:6812  0% (doesn't exist)
osd78 192.168.43.20:6820  0% (doesn't exist)
osd79 192.168.43.21:6808  0% (doesn't exist)
osd80 192.168.43.16:6816 100% (exists, up)
osd81 192.168.43.17:6851 100% (exists, up)
osd82 192.168.43.18:6836  0% (doesn't exist)
osd83 192.168.43.19:6832  0% (doesn't exist)
osd84 192.168.43.20:6836  0% (doesn't exist)
osd85 192.168.43.21:6812  0% (doesn't exist)
osd86 192.168.43.16:6832 86% (exists, up)
osd87 192.168.43.17:6824 100% (exists, up)
osd88 192.168.43.18:6824  0% (doesn't exist)
osd89 192.168.43.19:6816  0% (doesn't exist)
osd90 192.168.43.20:6800  0% (doesn't exist)
osd91 192.168.43.21:6828  0% (doesn't exist)
osd92 192.168.43.16:6801 100% (exists, up)
osd93 192.168.43.17:6828 100% (exists, up)
osd94 192.168.43.18:6804  0% (doesn't exist)
osd95 192.168.43.19:6820  0% (doesn't exist)
osd96 192.168.43.20:6828  0% (doesn't exist)
osd97 192.168.43.21:6804  0% (doesn't exist)
osd98 192.168.43.16:6800 100% (exists, up)
osd99 192.168.43.17:6801 100% (exists, up)
osd100 192.168.43.18:6828  0% (doesn't exist)
osd101 192.168.43.19:6836  0% (doesn't exist)
osd102 192.168.43.20:6832  0% (doesn't exist)
osd103 192.168.43.21:6820  0% (doesn't exist)
osd104 192.168.43.16:6802 100% (exists, up)
osd105 192.168.43.17:6820 100% (exists, up)
osd106 192.168.43.18:6816  0% (doesn't exist)
osd107 192.168.43.19:6804  0% (doesn't exist)
osd108 192.168.43.20:6812  0% (doesn't exist)
osd109 192.168.43.21:6824  0% (doesn't exist)
/sys/kernel/debug/ceph/409059ba-797e-46da-bc2f-83e3c7779094.client400179/monmap
epoch 1
mon0 192.168.43.12:6789
mon1 192.168.43.13:6789
mon2 192.168.43.14:6789
/sys/kernel/debug/ceph/409059ba-797e-46da-bc2f-83e3c7779094.client400179/osdc
155198393 osd28 2.2e56 rb.0.9e5ab.6b8b4567.00000001f432 write
155198396 osd38 2.2d68 rb.0.5b06d.6b8b4567.00000001f433 write
155198529 osd87 2.27d6 rb.0.578cc.6b8b4567.0000000c0021 write
155198530 osd80 2.33e rb.0.578c3.6b8b4567.0000000c0021 write
155198531 osd16 2.79ce rb.0.5486a.6b8b4567.000000006421 write
155198532 osd22 2.b35e rb.0.899f7.6b8b4567.00000000322f write
155198533 osd26 2.ea40 rb.0.2b68d4.6b8b4567.000000040022 write
155198534 osd20 2.713d rb.0.578d5.6b8b4567.0000000c0021 write
155198535 osd26 2.e436 rb.0.54935.6b8b4567.000000006421 write
155198536 osd56 2.cc9d rb.0.56fb6.6b8b4567.00000001f421 write
155198537 osd80 2.936b rb.0.5486d.6b8b4567.000000006421 write
155198539 osd51 2.d1bd rb.0.51ae2.6b8b4567.000000018623 write
155198586 osd40 2.f093 rb.0.899f7.6b8b4567.000000000470 write
155198587 osd40 2.f093 rb.0.899f7.6b8b4567.000000000470 write
155198597 osd40 2.f093 rb.0.899f7.6b8b4567.000000000470 write
155198598 osd40 2.f093 rb.0.899f7.6b8b4567.000000000470 write
155199106 osd40 2.f093 rb.0.899f7.6b8b4567.000000000470 write
155199460 osd40 2.f093 rb.0.899f7.6b8b4567.000000000470 write
/sys/kernel/debug/ceph/409059ba-797e-46da-bc2f-83e3c7779094.client400179/monc
have osdmap 32317
want next osdmap
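
The osdc dump above lists in-flight OSD requests, one per line, with the target OSD in the second column; the same object (`rb.0.899f7...0470`) repeatedly queued against osd40 is the kind of pattern that points at stuck writes. A minimal sketch for summarizing such a dump (the heredoc holds sample lines copied from the output above; on a live host you would feed `cat /sys/kernel/debug/ceph/*/osdc` into the same awk instead):

```shell
# Count in-flight requests per OSD from an osdc dump.
# The sample data below is copied from the osdc output in this thread;
# on a real host, replace the heredoc with: cat /sys/kernel/debug/ceph/*/osdc
awk '{count[$2]++} END {for (osd in count) print osd, count[osd]}' <<'EOF' | sort
155198586 osd40 2.f093 rb.0.899f7.6b8b4567.000000000470 write
155198587 osd40 2.f093 rb.0.899f7.6b8b4567.000000000470 write
155198529 osd87 2.27d6 rb.0.578cc.6b8b4567.0000000c0021 write
EOF
# prints:
#   osd40 2
#   osd87 1
```

A count that stays high, or keeps growing, for one OSD across repeated reads of osdc suggests requests are wedged on that OSD.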

2016-04-15 16:27 GMT+08:00 Ilya Dryomov <idryomov@xxxxxxxxx>:
> On Fri, Apr 15, 2016 at 10:18 AM, lin zhou <hnuzhoulin2@xxxxxxxxx> wrote:
>> Hi, cephers:
>> In one of my Ceph clusters, we map an rbd device and mount it on node1,
>> then share it via Samba to back up several VMs and some web root
>> directories.
>>
>> Yesterday, one of the disks in my cluster reached 95% full, and the
>> cluster stopped accepting write requests.
>> I have resolved the full-disk problem, but the directories mounted from
>> the mapped rbd on node1 still do not work correctly:
>>
>> Case 1: I can access some directories on node1, but some files cannot be opened.
>> Case 2: If I enter a directory, the cd command hangs; its state in ps is
>> D, and it cannot be killed. If I map this rbd on another host and
>> mount it there, I can see all the files.
>>
>> So is there some option for rbd map, or for the subsequent mount command,
>> that would prevent hangs when the cluster briefly stops accepting
>> writes like this?
>
> Can you provide the output of
>
> # ceph -s
> # find /sys/kernel/debug/ceph -type f -print -exec cat {} \;
>
> from the faulty host?
>
> Thanks,
>
>                 Ilya
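
For context, the trigger described above (cluster-wide write blocking once one OSD fills up) is governed by the monitor's nearfull/full thresholds. A sketch of the relevant ceph.conf options, with the default values of this era shown; raising the full ratio only buys temporary headroom and should be done with care:

```ini
[global]
# OSD utilization at which the cluster flags HEALTH_WARN nearfull (default 0.85)
mon osd nearfull ratio = 0.85
# OSD utilization at which the cluster is marked full and blocks writes (default 0.95)
mon osd full ratio = 0.95
```

Keeping OSD utilization balanced (e.g. via reweighting) so no single OSD approaches the full ratio is the safer long-term fix.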
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


