Huge issues with slow requests

Hello,

On Sat, 6 Sep 2014 13:37:25 +0200 Josef Johansson wrote:

> Also putting this on the list.
> 
> On 06 Sep 2014, at 13:36, Josef Johansson <josef at oderland.se> wrote:
> 
> > Hi,
> > 
> > Same issues again, but I think we found the drive that causes the
> > problems.
> > 
> > But this is causing problems as it's trying to do a recovery to that
> > OSD at the moment.
> > 
> > So we're left with the status message 
> > 
> > 2014-09-06 13:35:07.580007 mon.0 [INF] pgmap v12678802: 6860 pgs: 6841
> > active+clean, 19 active+remapped+backfilling; 12299 GB data, 36882 GB
> > used, 142 TB / 178 TB avail; 1921KB/s rd, 192KB/s wr, 74op/s;
> > 41424/15131923 degraded (0.274%);  recovering 0 o/s, 2035KB/s
> > 
> > 
> > It's improving, but way too slowly. If I restart the recovery (ceph
> > osd set norecover / unset) it doesn't change the OSD as far as I can see.
> > 
> > Any ideas?
> > 
I don't know the state of your cluster, i.e. what caused the recovery to
start (how many OSDs went down?).
If you have a replication of 3 and only one OSD was involved, what is
stopping you from taking that wonky drive/OSD out?
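
For reference, a rough sketch of what checking and taking it out would
look like on a Dumpling-era cluster (osd.NN below is just a placeholder
for whichever OSD sits on the bad drive, not a number from your logs;
double-check against the docs for your release):

  # see what the cluster thinks is going on
  ceph -s
  ceph health detail
  ceph osd tree

  # mark the suspect OSD out so its PGs get remapped elsewhere
  ceph osd out NN

  # watch recovery progress
  ceph -w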

If you don't know that or want to play it safe, how about setting the
weight of that OSD to 0? 
While that will AFAICT still result in all primary PGs being evacuated
off it, no more writes will happen to it and reads might be faster.
In either case, it shouldn't slow down the rest of your cluster anymore.
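
A minimal sketch of the reweighting, again with osd.NN as a placeholder
and worth verifying against the documentation first:

  # set the CRUSH weight to 0 so no new data gets placed on it
  ceph osd crush reweight osd.NN 0

  # or keep its CRUSH entry and just zero the override weight (0.0 - 1.0),
  # which has much the same effect as marking it out
  ceph osd reweight NN 0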

Regards,

Christian
> > Cheers,
> > Josef
> > 
> > On 05 Sep 2014, at 11:26, Luis Periquito <luis.periquito at ocado.com>
> > wrote:
> > 
> >> The only time I saw such behaviour was when I was deleting a big chunk
> >> of data from the cluster: all the client activity was reduced, the op/s
> >> were almost non-existent and there were unjustified delays all over
> >> the cluster. But all the disks were somewhat busy in atop/iostat.
> >> 
> >> 
> >> On 5 September 2014 09:51, David <david at visions.se> wrote:
> >> Hi,
> >> 
> >> Indeed strange.
> >> 
> >> That output was from when we had issues; it seems that most operations
> >> were blocked / slow requests.
> >> 
> >> A "baseline" output is more like today:
> >> 
> >> 2014-09-05 10:44:29.123681 mon.0 [INF] pgmap v12582759: 6860 pgs:
> >> 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 9273KB/s rd, 24650KB/s wr, 2755op/s 2014-09-05 10:44:30.125637
> >> mon.0 [INF] pgmap v12582760: 6860 pgs: 6860 active+clean; 12253 GB
> >> data, 36574 GB used, 142 TB / 178 TB avail; 9500KB/s rd, 20430KB/s
> >> wr, 2294op/s 2014-09-05 10:44:31.139427 mon.0 [INF] pgmap v12582761:
> >> 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB /
> >> 178 TB avail; 9216KB/s rd, 20062KB/s wr, 2488op/s 2014-09-05
> >> 10:44:32.144945 mon.0 [INF] pgmap v12582762: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 12511KB/s rd, 15739KB/s wr, 2488op/s 2014-09-05 10:44:33.161210 mon.0
> >> [INF] pgmap v12582763: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 18593KB/s rd, 14880KB/s wr,
> >> 2609op/s 2014-09-05 10:44:34.187294 mon.0 [INF] pgmap v12582764: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 17720KB/s rd, 22964KB/s wr, 3257op/s 2014-09-05
> >> 10:44:35.190785 mon.0 [INF] pgmap v12582765: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 19230KB/s rd, 18901KB/s wr, 3199op/s 2014-09-05 10:44:36.213535 mon.0
> >> [INF] pgmap v12582766: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 17630KB/s rd, 18855KB/s wr,
> >> 3131op/s 2014-09-05 10:44:37.220052 mon.0 [INF] pgmap v12582767: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 12262KB/s rd, 18627KB/s wr, 2595op/s 2014-09-05
> >> 10:44:38.233357 mon.0 [INF] pgmap v12582768: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 17697KB/s rd, 17572KB/s wr, 2156op/s 2014-09-05 10:44:39.239409 mon.0
> >> [INF] pgmap v12582769: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 20300KB/s rd, 19735KB/s wr,
> >> 2197op/s 2014-09-05 10:44:40.260423 mon.0 [INF] pgmap v12582770: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 14656KB/s rd, 15460KB/s wr, 2199op/s 2014-09-05
> >> 10:44:41.269736 mon.0 [INF] pgmap v12582771: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 8969KB/s rd, 11918KB/s wr, 1951op/s 2014-09-05 10:44:42.276192 mon.0
> >> [INF] pgmap v12582772: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 7272KB/s rd, 10644KB/s wr,
> >> 1832op/s 2014-09-05 10:44:43.291817 mon.0 [INF] pgmap v12582773: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 9316KB/s rd, 16610KB/s wr, 2412op/s 2014-09-05 10:44:44.295469
> >> mon.0 [INF] pgmap v12582774: 6860 pgs: 6860 active+clean; 12253 GB
> >> data, 36574 GB used, 142 TB / 178 TB avail; 9257KB/s rd, 19953KB/s
> >> wr, 2633op/s 2014-09-05 10:44:45.315774 mon.0 [INF] pgmap v12582775:
> >> 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB /
> >> 178 TB avail; 9718KB/s rd, 14298KB/s wr, 2101op/s 2014-09-05
> >> 10:44:46.326783 mon.0 [INF] pgmap v12582776: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 20877KB/s rd, 12822KB/s wr, 2447op/s 2014-09-05 10:44:47.327537 mon.0
> >> [INF] pgmap v12582777: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 18447KB/s rd, 12945KB/s wr,
> >> 2226op/s 2014-09-05 10:44:48.348725 mon.0 [INF] pgmap v12582778: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 15514KB/s rd, 19366KB/s wr, 1892op/s 2014-09-05
> >> 10:44:49.357537 mon.0 [INF] pgmap v12582779: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 16640KB/s rd, 19694KB/s wr, 1887op/s 2014-09-05 10:44:50.365016 mon.0
> >> [INF] pgmap v12582780: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 9139KB/s rd, 10734KB/s wr,
> >> 1566op/s 2014-09-05 10:44:51.385553 mon.0 [INF] pgmap v12582781: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 11493KB/s rd, 13827KB/s wr, 1928op/s 2014-09-05
> >> 10:44:52.389189 mon.0 [INF] pgmap v12582782: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 12922KB/s rd, 12814KB/s wr, 1670op/s 2014-09-05 10:44:53.403735 mon.0
> >> [INF] pgmap v12582783: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 14880KB/s rd, 17532KB/s wr,
> >> 2145op/s 2014-09-05 10:44:54.410581 mon.0 [INF] pgmap v12582784: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 15419KB/s rd, 19696KB/s wr, 2259op/s 2014-09-05
> >> 10:44:55.420308 mon.0 [INF] pgmap v12582785: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 10964KB/s rd, 15739KB/s wr, 1701op/s 2014-09-05 10:44:56.445017 mon.0
> >> [INF] pgmap v12582786: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 16195KB/s rd, 15554KB/s wr,
> >> 2444op/s 2014-09-05 10:44:57.452136 mon.0 [INF] pgmap v12582787: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 15016KB/s rd, 16356KB/s wr, 2358op/s 2014-09-05
> >> 10:44:58.465958 mon.0 [INF] pgmap v12582788: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 11668KB/s rd, 18443KB/s wr, 2029op/s 2014-09-05 10:44:59.483462 mon.0
> >> [INF] pgmap v12582789: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 12253KB/s rd, 10846KB/s wr,
> >> 1529op/s 2014-09-05 10:45:00.492322 mon.0 [INF] pgmap v12582790: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 12247KB/s rd, 7084KB/s wr, 1464op/s 2014-09-05 10:45:01.516581
> >> mon.0 [INF] pgmap v12582791: 6860 pgs: 6860 active+clean; 12253 GB
> >> data, 36574 GB used, 142 TB / 178 TB avail; 16460KB/s rd, 12089KB/s
> >> wr, 2537op/s 2014-09-05 10:45:02.527110 mon.0 [INF] pgmap v12582792:
> >> 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB /
> >> 178 TB avail; 13382KB/s rd, 15080KB/s wr, 2563op/s 2014-09-05
> >> 10:45:03.538090 mon.0 [INF] pgmap v12582793: 6860 pgs: 6860
> >> active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 10902KB/s rd, 18745KB/s wr, 2863op/s 2014-09-05 10:45:04.558261 mon.0
> >> [INF] pgmap v12582794: 6860 pgs: 6860 active+clean; 12253 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 10850KB/s rd, 15995KB/s wr,
> >> 2695op/s 2014-09-05 10:45:05.565750 mon.0 [INF] pgmap v12582795: 6860
> >> pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 9636KB/s rd, 13262KB/s wr, 2372op/s 2014-09-05 10:45:06.593984
> >> mon.0 [INF] pgmap v12582796: 6860 pgs: 6860 active+clean; 12253 GB
> >> data, 36574 GB used, 142 TB / 178 TB avail; 18808KB/s rd, 19329KB/s
> >> wr, 3819op/s 2014-09-05 10:45:07.595866 mon.0 [INF] pgmap v12582797:
> >> 6860 pgs: 6860 active+clean; 12253 GB data, 36574 GB used, 142 TB /
> >> 178 TB avail; 21265KB/s rd, 20743KB/s wr, 3861op/s 2014-09-05
> >> 10:45:08.624949 mon.0 [INF] pgmap v12582798: 6860 pgs: 6860
> >> active+clean; 12254 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 20114KB/s rd, 18543KB/s wr, 3248op/s 2014-09-05 10:45:09.627901 mon.0
> >> [INF] pgmap v12582799: 6860 pgs: 6860 active+clean; 12254 GB data,
> >> 36574 GB used, 142 TB / 178 TB avail; 14717KB/s rd, 15141KB/s wr,
> >> 2302op/s 2014-09-05 10:45:10.643234 mon.0 [INF] pgmap v12582800: 6860
> >> pgs: 6860 active+clean; 12254 GB data, 36574 GB used, 142 TB / 178 TB
> >> avail; 8328KB/s rd, 13950KB/s wr, 1919op/s 2014-09-05 10:45:11.651602
> >> mon.0 [INF] pgmap v12582801: 6860 pgs: 6860 active+clean; 12254 GB
> >> data, 36574 GB used, 142 TB / 178 TB avail; 16978KB/s rd, 15921KB/s
> >> wr, 3377op/s 2014-09-05 10:45:12.674819 mon.0 [INF] pgmap v12582802:
> >> 6860 pgs: 6860 active+clean; 12254 GB data, 36574 GB used, 142 TB /
> >> 178 TB avail; 16471KB/s rd, 14034KB/s wr, 3379op/s 2014-09-05
> >> 10:45:13.688080 mon.0 [INF] pgmap v12582803: 6860 pgs: 6860
> >> active+clean; 12254 GB data, 36574 GB used, 142 TB / 178 TB avail;
> >> 16149KB/s rd, 12657KB/s wr, 2734op/s
> >> 
> >> Aye, we actually saw latency on the disks go up a bit when we had
> >> 128GB of RAM on the OSD nodes and decided to beef them up to 256GB,
> >> which helped. They're running different workloads (shared hosting) but
> >> we've never encountered the issue we had yesterday, even during our
> >> testing/benchmarking.
> >> 
> >> Kind Regards,
> >> David
> >> 
> >> On 5 Sep 2014, at 09:05, Christian Balzer <chibi at gol.com> wrote:
> >> 
> >> >
> >> > Hello,
> >> >
> >> > On Fri, 5 Sep 2014 08:26:47 +0200 David wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> Sorry for the lack of information yesterday; this was "solved"
> >> >> after some 30 minutes, after having reloaded/restarted all OSD
> >> >> daemons. Unfortunately we couldn't pinpoint it to a single OSD or
> >> >> drive: all drives seemed OK, some had a bit higher latency, and we
> >> >> tried to out/in them to see if it had any effect, which it did
> >> >> not.
> >> >>
> >> > This is odd.
> >> > Having it "fixed" by restarting all OSDs would suggest either a
> >> > software problem (bug) with Ceph or some resource other than the
> >> > storage system being starved. But memory seems unlikely, even with a
> >> > bloated, leaking OSD daemon. And CPU seems even less likely.
> >> >
> >> >> The cluster consists of 3 mon servers and 5 OSD servers, each with 10
> >> >> enterprise HDDs backed by 2 S3700 SSDs for journals. The OSD
> >> >> servers have 256GB of RAM and 2x E5-2630 v2 @ 2.60GHz CPUs.
> >> >>
> >> >> The log that I posted yesterday was just a small taste of the full
> >> >> one ;) The entries were all pointing to different OSDs that they were
> >> >> waiting for. We're also monitoring all of the VMs running on KVM,
> >> >> and we didn't see any exceptional throughput or IOPS usage before
> >> >> or during this event. We were checking iostat etc. and nothing was
> >> >> out of the ordinary.
> >> >>
> >> >> Going to double-check SMART and also see if we can offload some
> >> >> of the cluster in any way. If you have any other advice that'd be
> >> >> appreciated :)
> >> >>
> >> > If the mon output below is typical for your cluster (634KB/s rd,
> >> > 487KB/s wr, 90op/s), it should be able to handle that w/o breaking a
> >> > sweat. I guesstimate it at about 13K write IOPS, and your reads
> >> > should (after a while) come mostly from the nice big pagecache you
> >> > have on those 256GB RAM machines.
> >> >
> >> > Christian
> >> >
> >> >> Thanks for your help!
> >> >>
> >> >> Kind Regards,
> >> >> David
> >> >>
> >> >> On 5 Sep 2014, at 07:30, Martin B Nielsen <martin at unity3d.com> wrote:
> >> >>
> >> >>> Just echoing what Christian said.
> >> >>>
> >> >>> Also, IIRC the "currently waiting for subops from [" could also
> >> >>> mean a problem on those OSDs, as it waits for an ack from them (I
> >> >>> might remember wrong).
> >> >>>
> >> >>> If that is the case you might want to check in on osd 13 & 37 as
> >> >>> well.
> >> >>>
> >> >>> With the cluster load and size you should not have this problem;
> >> >>> I'm pretty sure you're dealing with a rogue/faulty osd/node
> >> >>> somewhere.
> >> >>>
> >> >>> Cheers,
> >> >>> Martin
> >> >>>
> >> >>>
> >> >>> On Fri, Sep 5, 2014 at 2:28 AM, Christian Balzer <chibi at gol.com>
> >> >>> wrote:
> >> >>>
> >> >>> On Thu, 4 Sep 2014 12:02:13 +0200 David wrote:
> >> >>>
> >> >>>> Hi,
> >> >>>>
> >> >>>> We're running a Ceph cluster with version:
> >> >>>>
> >> >>>> 0.67.7-1~bpo70+1
> >> >>>>
> >> >>>> All of a sudden we're having issues with the cluster (running RBD
> >> >>>> images for KVM) with slow requests on all of the OSD servers. Any
> >> >>>> idea why and how to fix it?
> >> >>>>
> >> >>> At least you give us a Ceph version, but for anybody to make
> >> >>> guesses we need much more information than a log spew.
> >> >>>
> >> >>> How many nodes/OSDs, OS, hardware, OSD details (FS, journals on
> >> >>> SSDs), etc.
> >> >>>
> >> >>> Run atop (in a sufficiently large terminal) on all your nodes,
> >> >>> see if you can spot a bottleneck, like a disk being at 100% all
> >> >>> the time with a much higher avio than the others.
> >> >>> Looking at your logs, I'd pay particular attention to the disk
> >> >>> holding osd.22.
> >> >>> A single slow disk can bring a whole large cluster to a crawl.
> >> >>> If you're using a hardware controller with a battery-backed
> >> >>> cache, check that it is fine; loss of the battery would switch it
> >> >>> from writeback to writethrough and massively slow down IOPS.
> >> >>>
> >> >>> Regards,
> >> >>>
> >> >>> Christian
> >> >>>>
> >> >>>> 2014-09-04 11:56:35.868521 mon.0 [INF] pgmap v12504451: 6860 pgs:
> >> >>>> 6860 active+clean; 12163 GB data, 36308 GB used, 142 TB / 178 TB
> >> >>>> avail; 634KB/s rd, 487KB/s wr, 90op/s 2014-09-04 11:56:29.510270
> >> >>>> osd.22 [WRN] 15 slow requests, 1 included below; oldest blocked
> >> >>>> for
> >> >>>>> 44.745754 secs 2014-09-04 11:56:29.510276 osd.22 [WRN] slow
> >> >>>>> request 30.999821 seconds
> >> >>>> old, received at 2014-09-04 11:55:58.510424:
> >> >>>> osd_op(client.10731617.0:81868956
> >> >>>> rbd_data.967e022eb141f2.0000000000000e72 [write 0~4194304]
> >> >>>> 3.c585cebe e13901) v4 currently waiting for subops from [37,13]
> >> >>>> 2014-09-04 11:56:30.510528 osd.22 [WRN] 21 slow requests, 6
> >> >>>> included below; oldest blocked for > 45.745989 secs 2014-09-04
> >> >>>> 11:56:30.510534 osd.22 [WRN] slow request 30.122555 seconds old,
> >> >>>> received at 2014-09-04 11:56:00.387925:
> >> >>>> osd_op(client.13425082.0:11962345
> >> >>>> rbd_data.54f24c3d1b58ba.0000000000003753 [stat,write
> >> >>>> 1114112~8192] 3.c9e49140 e13901) v4 currently waiting for subops
> >> >>>> from [13,42] 2014-09-04 11:56:30.510537 osd.22 [WRN] slow
> >> >>>> request 30.122362 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.388118: osd_op(client.13425082.0:11962352
> >> >>>> rbd_data.54f24c3d1b58ba.0000000000003753 [stat,write
> >> >>>> 1134592~4096] 3.c9e49140 e13901) v4 currently waiting for subops
> >> >>>> from [13,42] 2014-09-04 11:56:30.510541 osd.22 [WRN] slow
> >> >>>> request 30.122298 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.388182: osd_op(client.13425082.0:11962353
> >> >>>> rbd_data.54f24c3d1b58ba.0000000000003753 [stat,write
> >> >>>> 4046848~8192] 3.c9e49140 e13901) v4 currently waiting for subops
> >> >>>> from [13,42] 2014-09-04 11:56:30.510544 osd.22 [WRN] slow
> >> >>>> request 30.121577 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.388903: osd_op(client.13425082.0:11962374
> >> >>>> rbd_data.54f24c3d1b58ba.00000000000047f2 [stat,write
> >> >>>> 2527232~4096] 3.cd9a9015 e13901) v4 currently waiting for subops
> >> >>>> from [45,1] 2014-09-04 11:56:30.510548 osd.22 [WRN] slow request
> >> >>>> 30.121518 seconds old, received at 2014-09-04 11:56:00.388962:
> >> >>>> osd_op(client.13425082.0:11962375
> >> >>>> rbd_data.54f24c3d1b58ba.00000000000047f2 [stat,write
> >> >>>> 3133440~4096] 3.cd9a9015 e13901) v4 currently waiting for subops
> >> >>>> from [45,1] 2014-09-04 11:56:31.510706 osd.22 [WRN] 26 slow
> >> >>>> requests, 6 included below; oldest blocked for > 46.746163 secs
> >> >>>> 2014-09-04 11:56:31.510711 osd.22 [WRN] slow request 31.035418
> >> >>>> seconds old, received at 2014-09-04 11:56:00.475236:
> >> >>>> osd_op(client.9266625.0:135900595
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2097152~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:31.510715 osd.22 [WRN] slow
> >> >>>> request 31.035335 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.475319: osd_op(client.9266625.0:135900596
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2162688~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:31.510718 osd.22 [WRN] slow
> >> >>>> request 31.035270 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.475384: osd_op(client.9266625.0:135900597
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2400256~16384] 3.a2894ebe e13901) v4 currently waiting for
> >> >>>> subops from [37,13] 2014-09-04 11:56:31.510721 osd.22 [WRN] slow
> >> >>>> request 31.035093 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.475561: osd_op(client.9266625.0:135900598
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2420736~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:31.510724 osd.22 [WRN] slow
> >> >>>> request 31.034990 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.475664: osd_op(client.9266625.0:135900599
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2428928~12288] 3.a2894ebe e13901) v4 currently waiting for
> >> >>>> subops from [37,13] 2014-09-04 11:56:32.510864 osd.22 [WRN] 28
> >> >>>> slow requests, 6 included below; oldest blocked for > 47.746324
> >> >>>> secs 2014-09-04 11:56:32.510871 osd.22 [WRN] slow request
> >> >>>> 32.035026 seconds old, received at 2014-09-04 11:56:00.475789:
> >> >>>> osd_op(client.9266625.0:135900600
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2453504~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:32.510876 osd.22 [WRN] slow
> >> >>>> request 32.034982 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.475833: osd_op(client.9266625.0:135900601
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2502656~16384] 3.a2894ebe e13901) v4 currently waiting for
> >> >>>> subops from [37,13] 2014-09-04 11:56:32.510881 osd.22 [WRN] slow
> >> >>>> request 32.034917 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.475898: osd_op(client.9266625.0:135900602
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2600960~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:32.510885 osd.22 [WRN] slow
> >> >>>> request 32.034807 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.476008: osd_op(client.9266625.0:135900603
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2617344~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:32.510890 osd.22 [WRN] slow
> >> >>>> request 32.034756 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.476059: osd_op(client.9266625.0:135900604
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2637824~69632] 3.a2894ebe e13901) v4 currently waiting for
> >> >>>> subops from [37,13] 2014-09-04 11:56:32.723329 osd.17 [WRN] 4
> >> >>>> slow requests, 4 included below; oldest blocked for > 30.605794
> >> >>>> secs 2014-09-04 11:56:32.723335 osd.17 [WRN] slow request
> >> >>>> 30.605794 seconds old, received at 2014-09-04 11:56:02.117495:
> >> >>>> osd_op(client.12646411.0:31118135
> >> >>>> rbd_data.81aaf52eb141f2.0000000000000e4b [stat,write 0~4194304]
> >> >>>> 3.716b373b e13901) v4 currently waiting for subops from [31,1]
> >> >>>> 2014-09-04 11:56:32.723338 osd.17 [WRN] slow request 30.362042
> >> >>>> seconds old, received at 2014-09-04 11:56:02.361247:
> >> >>>> osd_op(client.12485793.0:121786489
> >> >>>> rbd_data.4ae87f2eb141f2.000000000000e200 [stat,write
> >> >>>> 2834432~4096] 3.6b43de72 e13901) v4 currently waiting for subops
> >> >>>> from [5,31] 2014-09-04 11:56:32.723341 osd.17 [WRN] slow request
> >> >>>> 30.179681 seconds old, received at 2014-09-04 11:56:02.543608:
> >> >>>> osd_op(client.13425079.0:7014393
> >> >>>> rbd_data.54f2522eb141f2.0000000000014d01 [stat,write
> >> >>>> 2502656~4096] 3.d8ceea89 e13901) v4 currently waiting for subops
> >> >>>> from [31,18] 2014-09-04 11:56:32.723344 osd.17 [WRN] slow
> >> >>>> request 30.000568 seconds old, received at 2014-09-04
> >> >>>> 11:56:02.722721: osd_op(client.9281279.0:144230325
> >> >>>> rbd_data.2d99fc2eb141f2.000000000000e80a [stat,write
> >> >>>> 1589248~4096] 3.16dc0e89 e13901) v4 currently waiting for subops
> >> >>>> from [31,18] 2014-09-04 11:56:33.511043 osd.22 [WRN] 33 slow
> >> >>>> requests, 6 included below; oldest blocked for > 48.746486 secs
> >> >>>> 2014-09-04 11:56:33.511049 osd.22 [WRN] slow request 33.034756
> >> >>>> seconds old, received at 2014-09-04 11:56:00.476221:
> >> >>>> osd_op(client.9266625.0:135900605
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2715648~8192] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:33.511053 osd.22 [WRN] slow
> >> >>>> request 33.034620 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.476357: osd_op(client.9266625.0:135900606
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2756608~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:33.511056 osd.22 [WRN] slow
> >> >>>> request 33.034446 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.476531: osd_op(client.9266625.0:135900607
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2809856~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:33.511060 osd.22 [WRN] slow
> >> >>>> request 33.034387 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.476590: osd_op(client.9266625.0:135900608
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2883584~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:33.511063 osd.22 [WRN] slow
> >> >>>> request 33.034341 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.476636: osd_op(client.9266625.0:135900609
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2891776~8192] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:34.511206 osd.22 [WRN] 38 slow
> >> >>>> requests, 6 included below; oldest blocked for > 49.746664 secs
> >> >>>> 2014-09-04 11:56:34.511215 osd.22 [WRN] slow request 34.034426
> >> >>>> seconds old, received at 2014-09-04 11:56:00.476729:
> >> >>>> osd_op(client.9266625.0:135900610
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2908160~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:34.511219 osd.22 [WRN] slow
> >> >>>> request 34.034298 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.476857: osd_op(client.9266625.0:135900611
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2932736~28672] 3.a2894ebe e13901) v4 currently waiting for
> >> >>>> subops from [37,13] 2014-09-04 11:56:34.511222 osd.22 [WRN] slow
> >> >>>> request 34.034234 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.476921: osd_op(client.9266625.0:135900612
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2969600~8192] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:34.511225 osd.22 [WRN] slow
> >> >>>> request 34.034147 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.477008: osd_op(client.9266625.0:135900613
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2981888~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:34.511228 osd.22 [WRN] slow
> >> >>>> request 34.034055 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.477100: osd_op(client.9266625.0:135900614
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 2998272~12288] 3.a2894ebe e13901) v4 currently waiting for
> >> >>>> subops from [37,13] 2014-09-04 11:56:35.153813 osd.37 [WRN] 1
> >> >>>> slow requests, 1 included below; oldest blocked for > 30.307864
> >> >>>> secs 2014-09-04 11:56:35.153820 osd.37 [WRN] slow request
> >> >>>> 30.307864 seconds old, received at 2014-09-04 11:56:04.845917:
> >> >>>> osd_op(client.10785413.0:129409910
> >> >>>> rbd_data.96803f2eb141f2.00000000000042b0 [stat,write
> >> >>>> 1712128~4096] 3.6c5ac950 e13901) v4 currently waiting for subops
> >> >>>> from [22,41] 2014-09-04 11:56:35.511450 osd.22 [WRN] 42 slow
> >> >>>> requests, 6 included below; oldest blocked for > 50.746638 secs
> >> >>>> 2014-09-04 11:56:35.511455 osd.22 [WRN] slow request 35.034219
> >> >>>> seconds old, received at 2014-09-04 11:56:00.477178:
> >> >>>> osd_op(client.9266625.0:135900615
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 3022848~8192] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:35.511459 osd.22 [WRN] slow
> >> >>>> request 35.034143 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.477254: osd_op(client.9266625.0:135900616
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 3035136~12288] 3.a2894ebe e13901) v4 currently waiting for
> >> >>>> subops from [37,13] 2014-09-04 11:56:35.511462 osd.22 [WRN] slow
> >> >>>> request 35.034082 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.477315: osd_op(client.9266625.0:135900617
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 3055616~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:35.511465 osd.22 [WRN] slow
> >> >>>> request 35.033965 seconds old, received at 2014-09-04
> >> >>>> 11:56:00.477432: osd_op(client.9266625.0:135900618
> >> >>>> rbd_data.42d6792eb141f2.000000000000bc00 [stat,write
> >> >>>> 3063808~4096] 3.a2894ebe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:35.511469 osd.22 [WRN] slow
> >> >>>> request 34.162275 seconds old, received at 2014-09-04
> >> >>>> 11:56:01.349122: osd_op(client.12485646.0:110385143
> >> >>>> rbd_data.4ccab12eb141f2.0000000000015b66 [stat,write
> >> >>>> 909312~4096] 3.9c416abe e13901) v4 currently waiting for subops
> >> >>>> from [37,13] 2014-09-04 11:56:36.511603 osd.22 [WRN] 34 slow
> >> >>>> requests, 6 included below; oldest blocked for > 51.746797 secs
> >> >>>> 2014-09-04 11:56:36.511609 osd.22 [WRN] slow request 34.401476
> >> >>>> seconds old, received at 2014-09-04 11:56:02.110080:
> >> >>>> osd_op(client.12485643.0:117955489
> >> >>>> rbd_data.4aad332eb141f2.0000000000002c00 [stat,write
> >> >>>> 2531328~4096] 3.b985ec92 e13901) v4 currently waiting for subops
> >> >>>> from [33,1] 2014-09-04 11:56:36.511612 osd.22 [WRN] slow request
> >> >>>> 34.149511 seconds old, received at 2014-09-04 11:56:02.362045:
> >> >>>> osd_op(client.12485793.0:121786505
> >> >>>> rbd_data.4ae87f2eb141f2.0000000000018a0a [stat,write
> >> >>>> 3342336~4096] 3.26907492 e13901) v4 currently waiting for subops
> >> >>>> from [33,1] 2014-09-04 11:56:36.511615 osd.22 [WRN] slow request
> >> >>>> 34.329528 seconds old, received at 2014-09-04 11:56:02.182028:
> >> >>>> osd_op(client.12646411.0:31118266
> >> >>>> rbd_data.81aaf52eb141f2.0000000000000e4f [stat,write 0~4194304]
> >> >>>> 3.168c978d e13901) v4 currently waiting for subops from [49,29]
> >> >>>> 2014-09-04 11:56:36.511618 osd.22 [WRN] slow request 31.871976
> >> >>>> seconds old, received at 2014-09-04 11:56:04.639580:
> >> >>>> osd_op(client.12485793.0:121786506
> >> >>>> rbd_data.4ae87f2eb141f2.0000000000018a0a [stat,write 0~4096]
> >> >>>> 3.26907492 e13901) v4 currently waiting for subops from [33,1]
> >> >>>> 2014-09-04 11:56:36.511621 osd.22 [WRN] slow request 31.008867
> >> >>>> seconds old, received at 2014-09-04 11:56:05.502689:
> >> >>>> osd_op(client.9266625.0:135901516
> >> >>>> rbd_data.42d6792eb141f2.0000000000017436 [stat,write
> >> >>>> 2682880~4096] 3.32067b97 e13901) v4 currently waiting for subops
> >> >>>> from [12,29] 2014-09-04 11:56:36.723882 osd.17 [WRN] 5 slow
> >> >>>> requests, 1 included below; oldest blocked for > 34.606349 secs
> >> >>>> 2014-09-04 11:56:36.723887 osd.17 [WRN] slow request 30.603057
> >> >>>> seconds old, received at 2014-09-04 11:56:06.120787:
> >> >>>> osd_op(client.12483520.0:11473870
> >> >>>> rbd_data.4b8e9b3d1b58ba.0000000000004600 [stat,write
> >> >>>> 2293760~4096] 3.49398689 e13901) v4 currently waiting for subops
> >> >>>> from [31,18] 2014-09-04 11:56:37.511787 osd.22 [WRN] 24 slow
> >> >>>> requests, 6 included below; oldest blocked for > 52.746984 secs
> >> >>>> 2014-09-04 11:56:37.511793 osd.22 [WRN] slow request 32.008986
> >> >>>> seconds old, received at 2014-09-04 11:56:05.502757:
> >> >>>> osd_op(client.9266625.0:135901517
> >> >>>> rbd_data.42d6792eb141f2.0000000000017436 [stat,write
> >> >>>> 2695168~4096] 3.32067b97 e13901) v4 currently waiting for subops
> >> >>>> from [12,29] 2014-09-04 11:56:37.511798 osd.22 [WRN] slow
> >> >>>> request 31.749483 seconds old, received at 2014-09-04
> >> >>>> 11:56:05.762260: osd_op(client.9281282.0:55728036
> >> >>>> rbd_data.2d99f62eb141f2.0000000000003753 [stat,write
> >> >>>> 1630208~8192] 3.6625df97 e13901) v4 currently waiting for subops
> >> >>>> from [12,29] 2014-09-04 11:56:37.511801 osd.22 [WRN] slow
> >> >>>> request 31.745558 seconds old, received at 2014-09-04
> >> >>>> 11:56:05.766185: osd_op(client.9281282.0:55728038
> >> >>>> rbd_data.2d99f62eb141f2.0000000000003753 [stat,write
> >> >>>> 1941504~4096] 3.6625df97 e13901) v4 currently waiting for subops
> >> >>>> from [12,29] 2014-09-04 11:56:37.511803 osd.22 [WRN] slow
> >> >>>> request 31.390945 seconds old, received at 2014-09-04
> >> >>>> 11:56:06.120798: osd_op(client.12483520.0:11473871
> >> >>>> rbd_data.4b8e9b3d1b58ba.0000000000005c00 [stat,write
> >> >>>> 2424832~8192] 3.47dbbb97 e13901) v4 currently waiting for subops
> >> >>>> from [12,29] 2014-09-04 11:56:37.511806 osd.22 [WRN] slow
> >> >>>> request 31.390357 seconds old, received at 2014-09-04
> >> >>>> 11:56:06.121386: osd_op(client.12483520.0:11473884
> >> >>>> rbd_data.4b8e9b3d1b58ba.0000000000005c00 [stat,write
> >> >>>> 2457600~16384] 3.47dbbb97 e13901) v4 currently waiting for
> >> >>>> subops from [12,29]
> >> >>>>
> >> >>>> Kind Regards,
> >> >>>>
> >> >>>> David
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Christian Balzer        Network/Systems Engineer
> >> >>> chibi at gol.com           Global OnLine Japan/Fusion Communications
> >> >>> http://www.gol.com/
> >> >>
> >> >
> >> >
> >> > --
> >> > Christian Balzer        Network/Systems Engineer
> >> > chibi at gol.com         Global OnLine Japan/Fusion Communications
> >> > http://www.gol.com/
> >> 
> >> 
> >> 
> >> 
> >> -- 
> >> Luis Periquito
> >> Unix Engineer
> >> 
> >> Ocado.com
> >> 
> >> Head Office, Titan Court, 3 Bishop Square, Hatfield Business Park,
> >> Hatfield, Herts AL10 9NE
> >> 
> >> 
> > 
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi at gol.com   	Global OnLine Japan/Fusion Communications
http://www.gol.com/

