Hi, Indeed strange. That output was when we had issues, seems that most operations were blocked / slow requests. A ?baseline? output is more like today: 2014-09-05 10:44:29.123681 mon.0 [INF] pgmap v12582759: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 9273KB/s rd, 24650KB/s wr, 2755op/s 2014-09-05 10:44:30.125637 mon.0 [INF] pgmap v12582760: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 9500KB/s rd, 20430KB/s wr, 2294op/s 2014-09-05 10:44:31.139427 mon.0 [INF] pgmap v12582761: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 9216KB/s rd, 20062KB/s wr, 2488op/s 2014-09-05 10:44:32.144945 mon.0 [INF] pgmap v12582762: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 12511KB/s rd, 15739KB/s wr, 2488op/s 2014-09-05 10:44:33.161210 mon.0 [INF] pgmap v12582763: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 18593KB/s rd, 14880KB/s wr, 2609op/s 2014-09-05 10:44:34.187294 mon.0 [INF] pgmap v12582764: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 17720KB/s rd, 22964KB/s wr, 3257op/s 2014-09-05 10:44:35.190785 mon.0 [INF] pgmap v12582765: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 19230KB/s rd, 18901KB/s wr, 3199op/s 2014-09-05 10:44:36.213535 mon.0 [INF] pgmap v12582766: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 17630KB/s rd, 18855KB/s wr, 3131op/s 2014-09-05 10:44:37.220052 mon.0 [INF] pgmap v12582767: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 12262KB/s rd, 18627KB/s wr, 2595op/s 2014-09-05 10:44:38.233357 mon.0 [INF] pgmap v12582768: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 17697KB/s rd, 17572KB/s wr, 2156op/s 2014-09-05 10:44:39.239409 mon.0 [INF] pgmap v12582769: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 20300KB/s rd, 19735KB/s wr, 2197op/s 2014-09-05 10:44:40.260423 mon.0 [INF] pgmap v12582770: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 14656KB/s rd, 15460KB/s wr, 2199op/s 2014-09-05 10:44:41.269736 mon.0 [INF] pgmap v12582771: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 8969KB/s rd, 11918KB/s wr, 1951op/s 2014-09-05 10:44:42.276192 mon.0 [INF] pgmap v12582772: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 7272KB/s rd, 10644KB/s wr, 1832op/s 2014-09-05 10:44:43.291817 mon.0 [INF] pgmap v12582773: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 9316KB/s rd, 16610KB/s wr, 2412op/s 2014-09-05 10:44:44.295469 mon.0 [INF] pgmap v12582774: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 9257KB/s rd, 19953KB/s wr, 2633op/s 2014-09-05 10:44:45.315774 mon.0 [INF] pgmap v12582775: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 9718KB/s rd, 14298KB/s wr, 2101op/s 2014-09-05 10:44:46.326783 mon.0 [INF] pgmap v12582776: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 20877KB/s rd, 12822KB/s wr, 2447op/s 2014-09-05 10:44:47.327537 mon.0 [INF] pgmap v12582777: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 18447KB/s rd, 12945KB/s wr, 2226op/s 2014-09-05 10:44:48.348725 mon.0 [INF] pgmap v12582778: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 15514KB/s rd, 19366KB/s wr, 1892op/s 2014-09-05 10:44:49.357537 mon.0 [INF] pgmap v12582779: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 16640KB/s rd, 19694KB/s wr, 1887op/s 2014-09-05 10:44:50.365016 mon.0 [INF] pgmap v12582780: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 9139KB/s rd, 10734KB/s wr, 1566op/s 2014-09-05 10:44:51.385553 mon.0 [INF] pgmap v12582781: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 11493KB/s rd, 13827KB/s wr, 1928op/s 2014-09-05 10:44:52.389189 mon.0 [INF] pgmap v12582782: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 12922KB/s rd, 12814KB/s wr, 1670op/s 2014-09-05 10:44:53.403735 mon.0 [INF] pgmap v12582783: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 14880KB/s rd, 17532KB/s wr, 2145op/s 2014-09-05 10:44:54.410581 mon.0 [INF] pgmap v12582784: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 15419KB/s rd, 19696KB/s wr, 2259op/s 2014-09-05 10:44:55.420308 mon.0 [INF] pgmap v12582785: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 10964KB/s rd, 15739KB/s wr, 1701op/s 2014-09-05 10:44:56.445017 mon.0 [INF] pgmap v12582786: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 16195KB/s rd, 15554KB/s wr, 2444op/s 2014-09-05 10:44:57.452136 mon.0 [INF] pgmap v12582787: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 15016KB/s rd, 16356KB/s wr, 2358op/s 2014-09-05 10:44:58.465958 mon.0 [INF] pgmap v12582788: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 11668KB/s rd, 18443KB/s wr, 2029op/s 2014-09-05 10:44:59.483462 mon.0 [INF] pgmap v12582789: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 12253KB/s rd, 10846KB/s wr, 1529op/s 2014-09-05 10:45:00.492322 mon.0 [INF] pgmap v12582790: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 12247KB/s rd, 7084KB/s wr, 1464op/s 2014-09-05 10:45:01.516581 mon.0 [INF] pgmap v12582791: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 16460KB/s rd, 12089KB/s wr, 2537op/s 2014-09-05 10:45:02.527110 mon.0 [INF] pgmap v12582792: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 13382KB/s rd, 15080KB/s wr, 2563op/s 2014-09-05 10:45:03.538090 mon.0 [INF] pgmap v12582793: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 10902KB/s rd, 18745KB/s wr, 2863op/s 2014-09-05 10:45:04.558261 mon.0 [INF] pgmap v12582794: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 10850KB/s rd, 15995KB/s wr, 2695op/s 2014-09-05 10:45:05.565750 mon.0 [INF] pgmap v12582795: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 9636KB/s rd, 13262KB/s wr, 2372op/s 2014-09-05 10:45:06.593984 mon.0 [INF] pgmap v12582796: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 18808KB/s rd, 19329KB/s wr, 3819op/s 2014-09-05 10:45:07.595866 mon.0 [INF] pgmap v12582797: 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail; 21265KB/s rd, 20743KB/s wr, 3861op/s 2014-09-05 10:45:08.624949 mon.0 [INF] pgmap v12582798: 6860 pgs: 6860 active+clean; 12254 GB data, 36574 GB used, 142 TB / 178 TB avail; 20114KB/s rd, 18543KB/s wr, 3248op/s 2014-09-05 10:45:09.627901 mon.0 [INF] pgmap v12582799: 6860 pgs: 6860 active+clean; 12254 GB data, 36574 GB used, 142 TB / 178 TB avail; 14717KB/s rd, 15141KB/s wr, 2302op/s 2014-09-05 10:45:10.643234 mon.0 [INF] pgmap v12582800: 6860 pgs: 6860 active+clean; 12254 GB data, 36574 GB used, 142 TB / 178 TB avail; 8328KB/s rd, 13950KB/s wr, 1919op/s 2014-09-05 10:45:11.651602 mon.0 [INF] pgmap v12582801: 6860 pgs: 6860 active+clean; 12254 GB data, 36574 GB used, 142 TB / 178 TB avail; 16978KB/s rd, 15921KB/s wr, 3377op/s 2014-09-05 10:45:12.674819 mon.0 [INF] pgmap v12582802: 6860 pgs: 6860 active+clean; 12254 GB data, 36574 GB used, 142 TB / 178 TB avail; 16471KB/s rd, 14034KB/s wr, 3379op/s 2014-09-05 10:45:13.688080 mon.0 [INF] pgmap v12582803: 6860 pgs: 6860 active+clean; 12254 GB data, 36574 GB used, 142 TB / 178 TB avail; 16149KB/s rd, 12657KB/s wr, 2734op/s Aye, we actually saw latency on the disks go up a bit when we had 128GB of RAM on the OSDs and decided to beef them up to 256GB which helped. They?re running different workloads (shared hosting) but we?ve never encountered the issue we had yesterday even during our testing/benchmarking. Kind Regards, David 5 sep 2014 kl. 09:05 skrev Christian Balzer <chibi at gol.com>: > > Hello, > > On Fri, 5 Sep 2014 08:26:47 +0200 David wrote: > >> Hi, >> >> Sorry for the lack of information yesterday, this was "solved" after >> some 30 minutes, after having reloaded/restarted all osd daemons. >> Unfortunately we couldn?t pin point it to a single OSD or drive, all >> drives seemed ok, some had a bit higher latency and we tried to out / in >> them to see if it had any effect which it did not. >> > This is odd. > Having it "fixed" by restarting all OSDs would suggest either a software > problem (bug) with Ceph or some resource other than the storage system > being starved. But memory seems unlikely, even with bloated, leaking OSD > daemon. And CPU seems even less likely. > >> The cluster consists of 3 mon servrers, 5 OSD servers with 10 enterprise >> HDDs backed with 2 S3700 SSDs for journals each. OSD servers have 256GB >> of RAM, 2x E5-2630 v2 @ 2.60GHz CPUs. >> >> The log that I posted yesterday was just a small taste of the full >> one ;) They were all pointing to different osd?s that they were waiting >> for. We?re also monitoring all of the VMs running on KVM, and we didn?t >> see any exceptional throughput or iops usage before or during this >> event. We were checking iostat etc and nothing was out of the ordinary.. >> >> Going to double check SMART and also see if we can off load some of the >> cluster in any way. If you have any other advice that?d be appreciated :) >> > If the mon output below is typical for your cluster (634KB/s rd, 487KB/s > wr, 90op/s) it should be able to handle that w/o breaking a sweat. > I guestimate it at about 13K write IOPS and your reads should (after a > while) come mostly from the nice big pagecache you have on those 256GB RAM > machines. > > Christian > >> Thanks for your help! >> >> Kind Regards, >> David >> >> 5 sep 2014 kl. 07:30 skrev Martin B Nielsen <martin at unity3d.com>: >> >>> Just echoing what Christian said. >>> >>> Also, iirc the "currently waiting for subobs on [" could also mean a >>> problem on those as it waits for ack from them (I might remember >>> wrong). >>> >>> If that is the case you might want to check in on osd 13 & 37 as well. >>> >>> With the cluster load and size you should not have this problem; I'm >>> pretty sure you're dealing with a rogue/faulty osd/node somewhere. >>> >>> Cheers, >>> Martin >>> >>> >>> On Fri, Sep 5, 2014 at 2:28 AM, Christian Balzer <chibi at gol.com> wrote: >>> On Thu, 4 Sep 2014 12:02:13 +0200 David wrote: >>> >>>> Hi, >>>> >>>> We?re running a ceph cluster with version: >>>> >>>> 0.67.7-1~bpo70+1 >>>> >>>> All of a sudden we?re having issues with the cluster (running RBD >>>> images for kvm) with slow requests on all of the OSD servers. Any >>>> idea why and how to fix it? >>>> >>> You give us a Ceph version at least, but for anybody to make guesses we >>> need much more information than a log spew. >>> >>> How many nodes/OSDs, OS, hardware, OSD details (FS, journals on SSDs), >>> etc. >>> >>> Run atop (in a sufficiently large terminal) on all your nodes, see if >>> you can spot a bottleneck, like a disk being at 100% all the time with >>> a much higher avio than the others. >>> Looking at your logs, I'd pay particular attention to the disk holding >>> osd.22. >>> A single slow disk can bring a whole large cluster to a crawl. >>> If you're using a hardware controller with a battery backed up cache, >>> check if that is fine, loss of the battery would switch from writeback >>> to writethrough and massively slow down IOPS. >>> >>> Regards, >>> >>> Christian >>>> >>>> 2014-09-04 11:56:35.868521 mon.0 [INF] pgmap v12504451: 6860 pgs: >>>> 6860 active+clean; 12163 GB data, 36308 GB used, 142 TB / 178 TB >>>> avail; 634KB/s rd, 487KB/s wr, 90op/s 2014-09-04 11:56:29.510270 >>>> osd.22 [WRN] 15 slow requests, 1 included below; oldest blocked for >>>>> 44.745754 secs 2014-09-04 11:56:29.510276 osd.22 [WRN] slow >>>>> request 30.999821 seconds >>>> old, received at 2014-09-04 11:55:58.510424: >>>> osd_op(client.10731617.0:81868956 >>>> rbd_data.967e022eb141f2.0000000000000e72 [write 0~4194304] 3.c585cebe >>>> e13901) v4 currently waiting for subops from [37,13] 2014-09-04 >>>> 11:56:30.510528 osd.22 [WRN] 21 slow requests, 6 included below; >>>> oldest blocked for > 45.745989 secs 2014-09-04 11:56:30.510534 >>>> osd.22 [WRN] slow request 30.122555 seconds old, received at >>>> 2014-09-04 11:56:00.387925: osd_op(client.13425082.0:11962345 >>>> rbd_data.54f24c3d1b58ba.0000000000003753 [stat,write 1114112~8192] >>>> 3.c9e49140 e13901) v4 currently waiting for subops from [13,42] >>>> 2014-09-04 11:56:30.510537 osd.22 [WRN] slow request 30.122362 >>>> seconds old, received at 2014-09-04 11:56:00.388118: >>>> osd_op(client.13425082.0:11962352 >>>> rbd_data.54f24c3d1b58ba.0000000000003753 [stat,write 1134592~4096] >>>> 3.c9e49140 e13901) v4 currently waiting for subops from [13,42] >>>> 2014-09-04 11:56:30.510541 osd.22 [WRN] slow request 30.122298 >>>> seconds old, received at 2014-09-04 11:56:00.388182: >>>> osd_op(client.13425082.0:11962353 >>>> rbd_data.54f24c3d1b58ba.0000000000003753 [stat,write 4046848~8192] >>>> 3.c9e49140 e13901) v4 currently waiting for subops from [13,42] >>>> 2014-09-04 11:56:30.510544 osd.22 [WRN] slow request 30.121577 >>>> seconds old, received at 2014-09-04 11:56:00.388903: >>>> osd_op(client.13425082.0:11962374 >>>> rbd_data.54f24c3d1b58ba.00000000000047f2 [stat,write 2527232~4096] >>>> 3.cd9a9015 e13901) v4 currently waiting for subops from [45,1] >>>> 2014-09-04 11:56:30.510548 osd.22 [WRN] slow request 30.121518 >>>> seconds old, received at 2014-09-04 11:56:00.388962: >>>> osd_op(client.13425082.0:11962375 >>>> rbd_data.54f24c3d1b58ba.00000000000047f2 [stat,write 3133440~4096] >>>> 3.cd9a9015 e13901) v4 currently waiting for subops from [45,1] >>>> 2014-09-04 11:56:31.510706 osd.22 [WRN] 26 slow requests, 6 included >>>> below; oldest blocked for > 46.746163 secs 2014-09-04 11:56:31.510711 >>>> osd.22 [WRN] slow request 31.035418 seconds old, received at >>>> 2014-09-04 11:56:00.475236: osd_op(client.9266625.0:135900595 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2097152~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:31.510715 osd.22 [WRN] slow request 31.035335 >>>> seconds old, received at 2014-09-04 11:56:00.475319: >>>> osd_op(client.9266625.0:135900596 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2162688~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:31.510718 osd.22 [WRN] slow request 31.035270 >>>> seconds old, received at 2014-09-04 11:56:00.475384: >>>> osd_op(client.9266625.0:135900597 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2400256~16384] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:31.510721 osd.22 [WRN] slow request 31.035093 >>>> seconds old, received at 2014-09-04 11:56:00.475561: >>>> osd_op(client.9266625.0:135900598 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2420736~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:31.510724 osd.22 [WRN] slow request 31.034990 >>>> seconds old, received at 2014-09-04 11:56:00.475664: >>>> osd_op(client.9266625.0:135900599 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2428928~12288] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:32.510864 osd.22 [WRN] 28 slow requests, 6 included >>>> below; oldest blocked for > 47.746324 secs 2014-09-04 11:56:32.510871 >>>> osd.22 [WRN] slow request 32.035026 seconds old, received at >>>> 2014-09-04 11:56:00.475789: osd_op(client.9266625.0:135900600 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2453504~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:32.510876 osd.22 [WRN] slow request 32.034982 >>>> seconds old, received at 2014-09-04 11:56:00.475833: >>>> osd_op(client.9266625.0:135900601 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2502656~16384] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:32.510881 osd.22 [WRN] slow request 32.034917 >>>> seconds old, received at 2014-09-04 11:56:00.475898: >>>> osd_op(client.9266625.0:135900602 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2600960~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:32.510885 osd.22 [WRN] slow request 32.034807 >>>> seconds old, received at 2014-09-04 11:56:00.476008: >>>> osd_op(client.9266625.0:135900603 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2617344~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:32.510890 osd.22 [WRN] slow request 32.034756 >>>> seconds old, received at 2014-09-04 11:56:00.476059: >>>> osd_op(client.9266625.0:135900604 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2637824~69632] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:32.723329 osd.17 [WRN] 4 slow requests, 4 included >>>> below; oldest blocked for > 30.605794 secs 2014-09-04 11:56:32.723335 >>>> osd.17 [WRN] slow request 30.605794 seconds old, received at >>>> 2014-09-04 11:56:02.117495: osd_op(client.12646411.0:31118135 >>>> rbd_data.81aaf52eb141f2.0000000000000e4b [stat,write 0~4194304] >>>> 3.716b373b e13901) v4 currently waiting for subops from [31,1] >>>> 2014-09-04 11:56:32.723338 osd.17 [WRN] slow request 30.362042 >>>> seconds old, received at 2014-09-04 11:56:02.361247: >>>> osd_op(client.12485793.0:121786489 >>>> rbd_data.4ae87f2eb141f2.000000000000e200 [stat,write 2834432~4096] >>>> 3.6b43de72 e13901) v4 currently waiting for subops from [5,31] >>>> 2014-09-04 11:56:32.723341 osd.17 [WRN] slow request 30.179681 >>>> seconds old, received at 2014-09-04 11:56:02.543608: >>>> osd_op(client.13425079.0:7014393 >>>> rbd_data.54f2522eb141f2.0000000000014d01 [stat,write 2502656~4096] >>>> 3.d8ceea89 e13901) v4 currently waiting for subops from [31,18] >>>> 2014-09-04 11:56:32.723344 osd.17 [WRN] slow request 30.000568 >>>> seconds old, received at 2014-09-04 11:56:02.722721: >>>> osd_op(client.9281279.0:144230325 >>>> rbd_data.2d99fc2eb141f2.000000000000e80a [stat,write 1589248~4096] >>>> 3.16dc0e89 e13901) v4 currently waiting for subops from [31,18] >>>> 2014-09-04 11:56:33.511043 osd.22 [WRN] 33 slow requests, 6 included >>>> below; oldest blocked for > 48.746486 secs 2014-09-04 11:56:33.511049 >>>> osd.22 [WRN] slow request 33.034756 seconds old, received at >>>> 2014-09-04 11:56:00.476221: osd_op(client.9266625.0:135900605 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2715648~8192] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:33.511053 osd.22 [WRN] slow request 33.034620 >>>> seconds old, received at 2014-09-04 11:56:00.476357: >>>> osd_op(client.9266625.0:135900606 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2756608~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:33.511056 osd.22 [WRN] slow request 33.034446 >>>> seconds old, received at 2014-09-04 11:56:00.476531: >>>> osd_op(client.9266625.0:135900607 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2809856~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:33.511060 osd.22 [WRN] slow request 33.034387 >>>> seconds old, received at 2014-09-04 11:56:00.476590: >>>> osd_op(client.9266625.0:135900608 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2883584~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:33.511063 osd.22 [WRN] slow request 33.034341 >>>> seconds old, received at 2014-09-04 11:56:00.476636: >>>> osd_op(client.9266625.0:135900609 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2891776~8192] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:34.511206 osd.22 [WRN] 38 slow requests, 6 included >>>> below; oldest blocked for > 49.746664 secs 2014-09-04 11:56:34.511215 >>>> osd.22 [WRN] slow request 34.034426 seconds old, received at >>>> 2014-09-04 11:56:00.476729: osd_op(client.9266625.0:135900610 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2908160~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:34.511219 osd.22 [WRN] slow request 34.034298 >>>> seconds old, received at 2014-09-04 11:56:00.476857: >>>> osd_op(client.9266625.0:135900611 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2932736~28672] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:34.511222 osd.22 [WRN] slow request 34.034234 >>>> seconds old, received at 2014-09-04 11:56:00.476921: >>>> osd_op(client.9266625.0:135900612 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2969600~8192] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:34.511225 osd.22 [WRN] slow request 34.034147 >>>> seconds old, received at 2014-09-04 11:56:00.477008: >>>> osd_op(client.9266625.0:135900613 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2981888~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:34.511228 osd.22 [WRN] slow request 34.034055 >>>> seconds old, received at 2014-09-04 11:56:00.477100: >>>> osd_op(client.9266625.0:135900614 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 2998272~12288] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:35.153813 osd.37 [WRN] 1 slow requests, 1 included >>>> below; oldest blocked for > 30.307864 secs 2014-09-04 11:56:35.153820 >>>> osd.37 [WRN] slow request 30.307864 seconds old, received at >>>> 2014-09-04 11:56:04.845917: osd_op(client.10785413.0:129409910 >>>> rbd_data.96803f2eb141f2.00000000000042b0 [stat,write 1712128~4096] >>>> 3.6c5ac950 e13901) v4 currently waiting for subops from [22,41] >>>> 2014-09-04 11:56:35.511450 osd.22 [WRN] 42 slow requests, 6 included >>>> below; oldest blocked for > 50.746638 secs 2014-09-04 11:56:35.511455 >>>> osd.22 [WRN] slow request 35.034219 seconds old, received at >>>> 2014-09-04 11:56:00.477178: osd_op(client.9266625.0:135900615 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 3022848~8192] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:35.511459 osd.22 [WRN] slow request 35.034143 >>>> seconds old, received at 2014-09-04 11:56:00.477254: >>>> osd_op(client.9266625.0:135900616 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 3035136~12288] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:35.511462 osd.22 [WRN] slow request 35.034082 >>>> seconds old, received at 2014-09-04 11:56:00.477315: >>>> osd_op(client.9266625.0:135900617 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 3055616~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:35.511465 osd.22 [WRN] slow request 35.033965 >>>> seconds old, received at 2014-09-04 11:56:00.477432: >>>> osd_op(client.9266625.0:135900618 >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write 3063808~4096] >>>> 3.a2894ebe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:35.511469 osd.22 [WRN] slow request 34.162275 >>>> seconds old, received at 2014-09-04 11:56:01.349122: >>>> osd_op(client.12485646.0:110385143 >>>> rbd_data.4ccab12eb141f2.0000000000015b66 [stat,write 909312~4096] >>>> 3.9c416abe e13901) v4 currently waiting for subops from [37,13] >>>> 2014-09-04 11:56:36.511603 osd.22 [WRN] 34 slow requests, 6 included >>>> below; oldest blocked for > 51.746797 secs 2014-09-04 11:56:36.511609 >>>> osd.22 [WRN] slow request 34.401476 seconds old, received at >>>> 2014-09-04 11:56:02.110080: osd_op(client.12485643.0:117955489 >>>> rbd_data.4aad332eb141f2.0000000000002c00 [stat,write 2531328~4096] >>>> 3.b985ec92 e13901) v4 currently waiting for subops from [33,1] >>>> 2014-09-04 11:56:36.511612 osd.22 [WRN] slow request 34.149511 >>>> seconds old, received at 2014-09-04 11:56:02.362045: >>>> osd_op(client.12485793.0:121786505 >>>> rbd_data.4ae87f2eb141f2.0000000000018a0a [stat,write 3342336~4096] >>>> 3.26907492 e13901) v4 currently waiting for subops from [33,1] >>>> 2014-09-04 11:56:36.511615 osd.22 [WRN] slow request 34.329528 >>>> seconds old, received at 2014-09-04 11:56:02.182028: >>>> osd_op(client.12646411.0:31118266 >>>> rbd_data.81aaf52eb141f2.0000000000000e4f [stat,write 0~4194304] >>>> 3.168c978d e13901) v4 currently waiting for subops from [49,29] >>>> 2014-09-04 11:56:36.511618 osd.22 [WRN] slow request 31.871976 >>>> seconds old, received at 2014-09-04 11:56:04.639580: >>>> osd_op(client.12485793.0:121786506 >>>> rbd_data.4ae87f2eb141f2.0000000000018a0a [stat,write 0~4096] >>>> 3.26907492 e13901) v4 currently waiting for subops from [33,1] >>>> 2014-09-04 11:56:36.511621 osd.22 [WRN] slow request 31.008867 >>>> seconds old, received at 2014-09-04 11:56:05.502689: >>>> osd_op(client.9266625.0:135901516 >>>> rbd_data.42d6792eb141f2.0000000000017436 [stat,write 2682880~4096] >>>> 3.32067b97 e13901) v4 currently waiting for subops from [12,29] >>>> 2014-09-04 11:56:36.723882 osd.17 [WRN] 5 slow requests, 1 included >>>> below; oldest blocked for > 34.606349 secs 2014-09-04 11:56:36.723887 >>>> osd.17 [WRN] slow request 30.603057 seconds old, received at >>>> 2014-09-04 11:56:06.120787: osd_op(client.12483520.0:11473870 >>>> rbd_data.4b8e9b3d1b58ba.0000000000004600 [stat,write 2293760~4096] >>>> 3.49398689 e13901) v4 currently waiting for subops from [31,18] >>>> 2014-09-04 11:56:37.511787 osd.22 [WRN] 24 slow requests, 6 included >>>> below; oldest blocked for > 52.746984 secs 2014-09-04 11:56:37.511793 >>>> osd.22 [WRN] slow request 32.008986 seconds old, received at >>>> 2014-09-04 11:56:05.502757: osd_op(client.9266625.0:135901517 >>>> rbd_data.42d6792eb141f2.0000000000017436 [stat,write 2695168~4096] >>>> 3.32067b97 e13901) v4 currently waiting for subops from [12,29] >>>> 2014-09-04 11:56:37.511798 osd.22 [WRN] slow request 31.749483 >>>> seconds old, received at 2014-09-04 11:56:05.762260: >>>> osd_op(client.9281282.0:55728036 >>>> rbd_data.2d99f62eb141f2.0000000000003753 [stat,write 1630208~8192] >>>> 3.6625df97 e13901) v4 currently waiting for subops from [12,29] >>>> 2014-09-04 11:56:37.511801 osd.22 [WRN] slow request 31.745558 >>>> seconds old, received at 2014-09-04 11:56:05.766185: >>>> osd_op(client.9281282.0:55728038 >>>> rbd_data.2d99f62eb141f2.0000000000003753 [stat,write 1941504~4096] >>>> 3.6625df97 e13901) v4 currently waiting for subops from [12,29] >>>> 2014-09-04 11:56:37.511803 osd.22 [WRN] slow request 31.390945 >>>> seconds old, received at 2014-09-04 11:56:06.120798: >>>> osd_op(client.12483520.0:11473871 >>>> rbd_data.4b8e9b3d1b58ba.0000000000005c00 [stat,write 2424832~8192] >>>> 3.47dbbb97 e13901) v4 currently waiting for subops from [12,29] >>>> 2014-09-04 11:56:37.511806 osd.22 [WRN] slow request 31.390357 >>>> seconds old, received at 2014-09-04 11:56:06.121386: >>>> osd_op(client.12483520.0:11473884 >>>> rbd_data.4b8e9b3d1b58ba.0000000000005c00 [stat,write 2457600~16384] >>>> 3.47dbbb97 e13901) v4 currently waiting for subops from [12,29] >>>> >>>> Kind Regards, >>>> >>>> David >>>> >>>> >>>> _______________________________________________ >>>> ceph-users mailing list >>>> ceph-users at lists.ceph.com >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>>> >>> >>> >>> -- >>> Christian Balzer Network/Systems Engineer >>> chibi at gol.com Global OnLine Japan/Fusion Communications >>> http://www.gol.com/ >>> _______________________________________________ >>> ceph-users mailing list >>> ceph-users at lists.ceph.com >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> >>> _______________________________________________ >>> ceph-users mailing list >>> ceph-users at lists.ceph.com >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> > > > -- > Christian Balzer Network/Systems Engineer > chibi at gol.com Global OnLine Japan/Fusion Communications > http://www.gol.com/ > _______________________________________________ > ceph-users mailing list > ceph-users at lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com