Hello,

On Tue, 17 May 2016 12:12:02 +1000 Chris Dunlop wrote:

> Hi Christian,
>
> On Tue, May 17, 2016 at 10:41:52AM +0900, Christian Balzer wrote:
> > On Tue, 17 May 2016 10:47:15 +1000 Chris Dunlop wrote:
> > Most of your questions would be easily answered if you did spend a few
> > minutes with even the crappiest test cluster and observing things
> > (with atop and the likes).
>
> You're right of course. I'll set up a test cluster and start
> experimenting, which I should have done before asking questions here.
>
> > To wit, this is a test pool (12) created with 32 PGs and slightly
> > filled with data via rados bench:
> > ---
> > # ls -la /var/lib/ceph/osd/ceph-8/current/ |grep "12\."
> > drwxr-xr-x 2 root root 4096 May 17 10:04 12.13_head
> > drwxr-xr-x 2 root root 4096 May 17 10:04 12.1e_head
> > drwxr-xr-x 2 root root 4096 May 17 10:04 12.b_head
> > # du -h /var/lib/ceph/osd/ceph-8/current/12.13_head/
> > 121M    /var/lib/ceph/osd/ceph-8/current/12.13_head/
> > ---
> >
> > After increasing that to 128 PGs we get this:
> > ---
> > # ls -la /var/lib/ceph/osd/ceph-8/current/ |grep "12\."
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.13_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.1e_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.2b_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.33_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.3e_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.4b_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.53_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.5e_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.6b_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.73_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.7e_head
> > drwxr-xr-x 2 root root 4096 May 17 10:18 12.b_head
> > # du -h /var/lib/ceph/osd/ceph-8/current/12.13_head/
> > 25M     /var/lib/ceph/osd/ceph-8/current/12.13_head/
> > ---
> >
> > Now this was fairly uneventful even on my crappy test cluster, given
> > the small amount of data (which was mostly cached) and the fact that
> > it's idle.
> >
> > However consider this with 100's of GB per PG and a busy cluster and
> > you get the idea where massive and very disruptive I/O comes from.
>
> Per above, I'll experiment with this, but my first thought is I suspect
> that's moving object/data files around rather than copying data, so the
> overheads are in directory operations rather than data copies - not that
> directory operations are free either of course.

That's correct, but given enough objects (and thus directory depths) and,
most of all, I/O contention in a busy cluster, the impact is quite
pronounced.

Christian

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
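
For anyone wanting to reproduce the experiment above, a minimal sketch of
the commands involved, using a placeholder pool name ("testpool") and an
arbitrary bench duration rather than the exact invocations used in the test.
Raising pg_num splits the existing PGs in place on their current OSDs (the
directory splits shown above); raising pgp_num afterwards lets CRUSH remap
the new PGs, which is what triggers the bulk data movement across the
cluster:
---
# create a small test pool with 32 PGs and write some objects into it
# (pool name and bench duration are placeholders)
ceph osd pool create testpool 32 32
rados bench -p testpool 60 write --no-cleanup

# split into 128 PGs; pgp_num must follow for placement to actually change
ceph osd pool set testpool pg_num 128
ceph osd pool set testpool pgp_num 128

# watch the cluster (or atop on the OSD nodes) while the split settles
ceph -w
---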