Re: Cluster unusable after 50% full, even with index sharding

Hello,

On Fri, 13 Apr 2018 11:59:01 -0500 Robert Stanford wrote:

>  I have 65TB stored on 24 OSDs on 3 hosts (8 OSDs per host).  SSD journals
> and spinning disks.  Our performance before was acceptable for our purposes
> - 300+MB/s simultaneous transmit and receive.  Now that we're up to about
> 50% of our total storage capacity (65/120TB, say), the write performance is
> still ok, but the read performance is unworkable (35MB/s!)
> 
As always, full details, please:
versions, HW, which SSDs, which HDDs and how they are connected, what FS is
on the OSDs, etc.
 
>  I am using index sharding, with 256 shards.  I don't see any CPUs
> saturated on any host (we are using radosgw by the way, and the load is
> light there as well).  The hard drives don't seem to be *too* busy (a
> random OSD shows ~10 wa in top).  The network's fine, as we were doing much
> better in terms of speed before we filled up.
>
top is an abysmal tool for these things; use atop in a big terminal window
on all 3 hosts for full situational awareness.
"iostat -x 3" might do in a pinch for the IO-related bits, too.

Keep in mind that a single busy OSD will drag the performance of the whole
cluster down. 
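For instance, a quick sketch to run on each OSD host (nothing beyond stock
sysstat is assumed here):

  # Extended per-device stats every 3 seconds. A single HDD sitting near
  # 100% in the %util column, or with a much higher await than its peers,
  # is enough to drag cluster-wide reads down.
  iostat -x 3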

Other things to check and verify (example commands are sketched below):
1. Are the OSDs reasonably balanced PG-wise?
2. How fragmented are the OSD filesystems?
3. Is a deep scrub running during the low-performance times?
4. Have you run out of RAM for the pagecache and, more importantly, the SLAB
cache for dentries due to the number of objects (files)?
If so, reads will require many more disk accesses than otherwise.
This is a typical wall to run into and can be mitigated by more RAM and
sysctl tuning.
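For example, a rough sketch of how to check the points above (this assumes
XFS-backed FileStore OSDs, which isn't confirmed yet; /dev/sdX1 is just a
placeholder for one OSD data partition):

  # 1. PG count and utilization per OSD -- look for outliers
  ceph osd df tree

  # 2. XFS fragmentation of one OSD data partition (read-only check)
  xfs_db -r -c frag /dev/sdX1

  # 3. Any scrubs or deep scrubs in flight while performance is low?
  ceph -s | grep -i scrub

  # 4. Is the dentry/inode SLAB being evicted under memory pressure?
  slabtop -o | head -20
  sysctl vm.vfs_cache_pressure   # default 100; lowering it favors keeping
                                 # dentries/inodes cached over pagecache

If the dentry line in slabtop shrinks drastically while reads are slow,
point 4 is likely your wall.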

Christian
 
>   Is there anything we can do about this, short of replacing hardware?  Is
> it really a limitation of Ceph that getting 50% full makes your cluster
> unusable?  Index sharding has seemed to not help at all (I did some
> benchmarking, with 128 shards and then 256; same result each time.)
> 
>  Or are we out of luck?


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Rakuten Communications
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


