On Wed, Jun 5, 2013 at 11:35 PM, Artem Silenkov <artem.silenkov@xxxxxxxxx> wrote:
> Good day!
>
> Thank you, but it's not clear for me what is a bottleneck here.
>
> - Hardware node - load average, disk IO
>
> - underlying file system problem on osd or disk bad.
>
> - ceph journal problem
>
> Ceph osd partition is a part of block device which has practically no load
>
> Device:     tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sda       12,00         0,00         0,12          0          0
>
> Device:     tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sda       12,00         0,00         0,14          0          0
>
> Disk with osd is good, just checked it and have good r/w speed with
> appropriate iops and latency.
>
> But hardware node is working hard and have high load average. I fear that
> ceph-osd process lack resources. Is there any way to fix it? May be raise
> some kind of timeout when syncing or make this osd less weight or so?
>
> Or its better to move this osd to another server?

If it's part of a block device, are the other OSDs also part of a
block device, or do they have dedicated partitions? If its hardware
node has a higher load average than the others that could certainly be
involved. But all I can tell you from what you've given me is that the
OSD is issuing a sync to the filesystem, and the filesystem is taking
multiple minutes to return so eventually the OSD gives up and commits
suicide.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
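
One rough way to check the "filesystem is slow to return from a sync" theory from outside Ceph is to time a few flushes on the OSD's filesystem yourself. The sketch below is only an illustration, not anything Ceph ships: `time_fsync` and `slow_syncs` are hypothetical helper names, and it times `fsync` on a single scratch file, which is a much smaller operation than the filesystem-wide sync the OSD issues. Multi-second results here on a loaded node would still point in the same direction.

```python
import os
import tempfile
import time


def time_fsync(directory, size_bytes=4 * 1024 * 1024, rounds=5):
    """Write size_bytes to a scratch file in `directory`, then time fsync.

    Returns one latency in seconds per round. The scratch file is
    removed automatically when each round finishes.
    """
    payload = b"\0" * size_bytes
    latencies = []
    for _ in range(rounds):
        with tempfile.NamedTemporaryFile(dir=directory) as f:
            f.write(payload)
            f.flush()  # push Python's buffer to the kernel first
            start = time.monotonic()
            os.fsync(f.fileno())  # block until the kernel reports the data flushed
            latencies.append(time.monotonic() - start)
    return latencies


def slow_syncs(latencies, threshold_s):
    """Return the latencies that exceeded threshold_s seconds."""
    return [t for t in latencies if t > threshold_s]


# Example use (the directory is a placeholder, point it at the
# filesystem that holds the OSD data):
#   results = time_fsync("/path/to/osd/data", rounds=10)
#   print(max(results), slow_syncs(results, 1.0))
```

Running this on the busy node and on a quiet node with the same layout gives a crude comparison of how long each filesystem takes to flush. The 1.0 second threshold in the example is arbitrary and has nothing to do with any Ceph timeout.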