On Mon, Dec 8, 2014 at 6:39 PM, Christian Balzer <chibi@xxxxxxx> wrote:
>
> Hello,
>
> Debian Jessie cluster, thus kernel 3.16, ceph 0.80.7.
> 3 storage nodes with 8 OSDs (journals on 4 SSDs) each, 3 mons.
> 2 compute nodes, everything connected via Infiniband.
>
> This is pre-production; currently there are only 3 VMs and 2 of them were
> idle at the time. The non-idle one was having 600GB of maildirs copied
> onto it, which stresses things but not Ceph, as those millions of small
> files coalesce nicely and result in rather few Ceph ops.
>
> A couple of hours into that copy marathon (the source FS and machine are
> slow, and rsync isn't particularly speedy with this kind of operation
> either) this happened:
> ---
> 2014-12-06 19:20:57.023974 osd.23 10.0.8.23:6815/3552 77 : [WRN] slow request
> 30.673939 seconds old, received at 2014-12-06 19:20:26.346746:
> osd_op(client.33776.0:743596 rb.0.819b.238e1f29.00000003f52f [set-alloc-hint
> object_size 4194304 write_size 4194304,write 1748992~4096] 3.efa97e35
> ack+ondisk+write e380) v4 currently waiting for subops from 4,8
> 2014-12-06 19:20:57.023991 osd.23 10.0.8.23:6815/3552 78 : [WRN] slow request
> 30.673886 seconds old, received at 2014-12-06 19:20:26.346799:
> osd_op(client.33776.0:743597 rb.0.819b.238e1f29.00000003f52f [set-alloc-hint
> object_size 4194304 write_size 4194304,write 1945600~4096] 3.efa97e35
> ack+ondisk+write e380) v4 currently waiting for subops from 4,8
> 2014-12-06 19:20:57.323976 osd.1 10.0.8.21:6815/4868 123 : [WRN] slow request
> 30.910821 seconds old, received at 2014-12-06 19:20:26.413051:
> osd_op(client.33776.0:743604 rb.0.819b.238e1f29.00000003e628 [set-alloc-hint
> object_size 4194304 write_size 4194304,write 1794048~1835008] 3.5e76b8ba
> ack+ondisk+write e380) v4 currently waiting for subops from 8,17
> ---
>
> There were a few more later, but they all involved OSD 8 as the common
> factor.
>
> Alas, there's nothing in the osd-8.log indicating why:
> ---
> 2014-12-06 19:13:13.933636 7fce85552700 0 -- 10.0.8.22:6835/5389 >>
> 10.0.8.6:0/716350435 pipe(0x7fcec3c25900 sd=23 :6835 s=0 pgs=0 cs=0 l=0
> c=0x7fcebfad03c0).accept peer addr is really 10.0.8.6:0/716350435 (socket is
> 10.0.8.6:50592/0)
> 2014-12-06 19:20:56.595773 7fceac82f700 0 log [WRN] : 3 slow requests, 3
> included below; oldest blocked for > 30.241397 secs
> 2014-12-06 19:20:56.595796 7fceac82f700 0 log [WRN] : slow request 30.241397
> seconds old, received at 2014-12-06 19:20:26.354247:
> osd_sub_op(client.33776.0:743596 3.235
> efa97e35/rb.0.819b.238e1f29.00000003f52f/head//3 [] v 380'3783
> snapset=0=[]:[] snapc=0=[]) v11 currently started
> 2014-12-06 19:20:56.595825 7fceac82f700 0 log [WRN] : slow request 30.240286
> seconds old, received at 2014-12-06 19:20:26.355358:
> osd_sub_op(client.33776.0:743597 3.235
> efa97e35/rb.0.819b.238e1f29.00000003f52f/head//3 [] v 380'3784
> snapset=0=[]:[] snapc=0=[]) v11 currently started
> 2014-12-06 19:20:56.595837 7fceac82f700 0 log [WRN] : slow request 30.177186
> seconds old, received at 2014-12-06 19:20:26.418458:
> osd_sub_op(client.33776.0:743604 3.ba
> 5e76b8ba/rb.0.819b.238e1f29.00000003e628/head//3 [] v 380'6439
> snapset=0=[]:[] snapc=0=[]) v11 currently started
> ---

That these requests are "currently started" and nothing else suggests that
they're probably waiting for one of the throttles to let them in, rather than
being particularly slow themselves.

> The HDDs and SSDs are new, there's nothing in the pertinent logs or smart
> that indicates any problem with that HDD or its journal SSD, nor the
> system in general.
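If you catch it while the requests are still blocked, the OSD's admin socket
will usually tell you what an op is stuck on. Something along these lines (a
sketch assuming osd.8 and the default admin socket setup; run it on the node
hosting that OSD, and the exact counter names vary a bit between releases):

---
# Ops currently blocked in osd.8, with the stage each one is sitting in
ceph daemon osd.8 dump_ops_in_flight

# Recently completed slow ops, with per-stage timestamps
ceph daemon osd.8 dump_historic_ops

# Perf counters; in the throttle-* sections compare "val" against "max" --
# a throttle pinned at its max would fit ops stuck waiting to be let in
ceph daemon osd.8 perf dump
---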
> This problem persisted (and the VM remained stuck) until OSD 8 was
> restarted the next day, when I discovered this.
>
> I suppose this is another "this can't/shouldn't happen" case, but I'd be
> delighted about any suggestions as to what happened here, potential
> prevention measures, and any insights on how to maybe coax more information
> out of Ceph if this happens again.

Nah, there are a million reasons stuff can be slow. It might just be a
transient overload of that disk compared to the others. If you see this again
while it's happening, I'd check the perfcounters; if you're keeping historical
snapshots of them, go look at the periods when things were blocked and see if
any of the throttles are at or near their maximum values.
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com