Re: "rbd ls -l" hangs

On Thu, Aug 1, 2013 at 9:57 AM, Jeff Moskow <jeff@xxxxxxx> wrote:
> Greg,
>
>     Thanks for the hints.  I looked through the logs and found OSDs with
> RETRYs.  I marked those "out" (marked in orange) and let Ceph rebalance.
> Then I ran the bench command.
> I now have many more errors than before :-(.
>
> health HEALTH_WARN 1 pgs incomplete; 1 pgs stuck inactive; 151 pgs stuck
> unclean
>
> Note that the incomplete pg is still the same (2.1f6).
>
> Any ideas on what to try next?
>
> 2013-08-01 12:39:38.349011 osd.4 172.16.170.2:6801/1778 1154 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 18.085318 sec at 57979 KB/sec
> 2013-08-01 12:39:38.499002 osd.5 172.16.170.2:6802/19375 454 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 18.232358 sec at 57511 KB/sec
> 2013-08-01 12:39:44.077347 osd.3 172.16.170.2:6800/1647 1211 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 23.813801 sec at 44032 KB/sec
> 2013-08-01 12:39:49.118812 osd.16 172.16.170.4:6802/1837 746 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 28.453320 sec at 36852 KB/sec
> 2013-08-01 12:39:48.468020 osd.15 172.16.170.4:6801/1699 821 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 27.802566 sec at 37715 KB/sec
> 2013-08-01 12:39:54.369364 osd.0 172.16.170.1:6800/3783 948 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 34.076451 sec at 30771 KB/sec
> 2013-08-01 12:39:48.618080 osd.14 172.16.170.4:6800/1572 16161 : [INF]
> bench: wrote 1024 MB in blocks of 4096 KB in 27.952574 sec at 37512 KB/sec
> 2013-08-01 12:39:54.382830 osd.2 172.16.170.1:6803/22033 222 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 34.090170 sec at 30758 KB/sec
> 2013-08-01 12:40:03.458096 osd.6 172.16.170.3:6801/1738 1582 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 43.143180 sec at 24304 KB/sec
> 2013-08-01 12:40:03.724504 osd.10 172.16.170.3:6800/1473 1238 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 43.409558 sec at 24155 KB/sec
> 2013-08-01 12:40:02.426650 osd.8 172.16.170.3:6803/2013 8272 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 42.111713 sec at 24899 KB/sec
> 2013-08-01 12:40:02.997093 osd.7 172.16.170.3:6802/1864 1094 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 42.682079 sec at 24567 KB/sec
> 2013-08-01 12:40:02.867046 osd.9 172.16.170.3:6804/2149 2258 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 42.551771 sec at 24642 KB/sec
> 2013-08-01 12:39:54.360014 osd.1 172.16.170.1:6801/4243 3060 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 34.070725 sec at 30776 KB/sec
> 2013-08-01 12:42:56.984632 osd.11 172.16.170.5:6800/28025 43996 : [INF]
> bench: wrote 1024 MB in blocks of 4096 KB in 216.687559 sec at 4839 KB/sec
> 2013-08-01 12:43:21.271481 osd.13 172.16.170.5:6802/1872 1056 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 240.974360 sec at 4351 KB/sec
> 2013-08-01 12:43:39.320462 osd.12 172.16.170.5:6801/1700 1348 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 259.023646 sec at 4048 KB/sec

Sorry for the slow reply; I've been out on vacation. :)
Looking through this list, I'm noticing that three of your OSDs (osd.11,
osd.12, and osd.13, all on 172.16.170.5) are reporting write speeds of
only 4-5 MB/s, and they don't correspond to the ones you marked out
(though if your cluster was somehow under load, that could have
something to do with the very different speed reports).
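
To make the slow OSDs easier to spot, the bench lines quoted above can be
parsed and ranked. The following is only a rough sketch: the regular
expression is written against the log format as it appears in this message,
and the names parse_bench, slow_osds, and the 10000 KB/sec cutoff are
illustrative assumptions, not anything Ceph provides.

```python
import re

# Matches one bench entry from the cluster log after wrapped lines are joined.
# The pattern is derived from the log lines quoted in this thread only.
BENCH_RE = re.compile(
    r"(osd\.\d+)\s+(\d+\.\d+\.\d+\.\d+):\d+/\d+\s+\d+\s+:\s+\[INF\]\s+bench:\s+"
    r"wrote\s+\d+\s+MB\s+in\s+blocks\s+of\s+\d+\s+KB\s+in\s+[\d.]+\s+sec\s+at\s+(\d+)\s+KB/sec"
)


def parse_bench(text):
    """Return a list of (osd, host_ip, kb_per_sec) tuples found in text."""
    # Strip mail quote markers and join wrapped lines before matching.
    joined = " ".join(line.lstrip("> ").strip() for line in text.splitlines())
    return [(osd, ip, int(rate)) for osd, ip, rate in BENCH_RE.findall(joined)]


def slow_osds(results, threshold_kb=10000):
    """Return entries below threshold_kb, slowest first.

    threshold_kb is an arbitrary illustrative cutoff, not a Ceph default.
    """
    return sorted((r for r in results if r[2] < threshold_kb), key=lambda r: r[2])


# Two entries copied from the quoted log, including one wrapped mid-entry.
sample = """\
> 2013-08-01 12:39:38.349011 osd.4 172.16.170.2:6801/1778 1154 : [INF] bench:
> wrote 1024 MB in blocks of 4096 KB in 18.085318 sec at 57979 KB/sec
> 2013-08-01 12:42:56.984632 osd.11 172.16.170.5:6800/28025 43996 : [INF]
> bench: wrote 1024 MB in blocks of 4096 KB in 216.687559 sec at 4839 KB/sec
"""

results = parse_bench(sample)
for osd, ip, rate in slow_osds(results):
    print(f"{osd} on {ip}: {rate} KB/sec")
```

Including the host address in each result makes it easy to see when the slow
OSDs all sit on a single machine, which points at that host rather than at
individual disks.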

You'll still want to look at the PG statistics for the stuck PG (2.1f6);
I'm not seeing those anywhere in your message.
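
One way to collect those statistics is the "ceph pg <pgid> query" command,
which prints JSON. The wrapper below is a minimal sketch: the helper names
pg_query_cmd, query_pg, and summarize are hypothetical, and the "state" and
"recovery_state" keys it reads are assumptions about the query output that
may differ between Ceph releases, so missing keys are tolerated.

```python
import json
import subprocess


def pg_query_cmd(pgid):
    """Build the argv for querying a single placement group."""
    return ["ceph", "pg", pgid, "query"]


def query_pg(pgid):
    """Run the query on a host with a working ceph CLI and parse its JSON."""
    proc = subprocess.run(
        pg_query_cmd(pgid), check=True, capture_output=True, text=True
    )
    return json.loads(proc.stdout)


def summarize(info):
    """Pull out the fields most useful for a stuck PG, if they are present."""
    # Key names are assumed; .get() keeps this safe when they are absent.
    return {
        "state": info.get("state"),
        "recovery_state": info.get("recovery_state", []),
    }


if __name__ == "__main__":
    # 2.1f6 is the incomplete PG named in the quoted message.
    print(json.dumps(summarize(query_pg("2.1f6")), indent=2))
```

Posting the full query output to the list, rather than only the summary, is
usually more helpful for whoever is diagnosing the problem.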
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com