Re: Intermittent poor performance on 3 node cluster

On Mon, Oct 21, 2013 at 8:05 AM, Pieter Steyn <pieter@xxxxxxxxxx> wrote:
> Hi all,
>
> I'm using Ceph as a filestore for my nginx web server, in order to have
> shared storage, and redundancy with automatic failover.
>
> The cluster is not high spec, but given my use case (lots of images) I am
> very disappointed with the current throughput I'm getting, and was hoping
> for some advice.
>
> I'm using CephFS and the latest Dumpling version on Ubuntu Server 12.04
>
> Server specs:
>
> CephFS1, CephFS2:
>
> Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz
> 12GB Ram
> 1x 2TB SATA XFS
> 1x 2TB SATA (For the journal)
>
> Each server runs 1x OSD, 1x MON and 1x MDS.
> A third server runs 1x MON for Paxos to work correctly.
> All machines are connected via a gigabit switch.
>
> The ceph config as follows:
>
> [global]
> fsid = 58b87152-5ce8-491e-ae9c-07caeea3fefb
> mon_initial_members = lb1, cephfs1, cephfs2
> mon_host = 192.168.1.58,192.168.1.70,192.168.1.72
> auth_supported = cephx
> osd_journal_size = 1024
> filestore_xattr_use_omap = true
>
> Osd dump:
>
> epoch 750
> fsid 58b87152-5ce8-491e-ae9c-07caeea3fefb
> created 2013-09-12 13:13:02.695411
> modified 2013-10-21 14:28:31.780838
> flags
>
> pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
> pg_num 64 pgp_num 64 last_change 1 owner 0 crash_replay_interval 45
> pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 1 object_hash rjenkins
> pg_num 64 pgp_num 64 last_change 1 owner 0
> pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 2 object_hash rjenkins
> pg_num 64 pgp_num 64 last_change 1 owner 0
>
> max_osd 4
> osd.0 up   in  weight 1 up_from 741 up_thru 748 down_at 739
> last_clean_interval [614,738) 192.168.1.70:6802/12325
> 192.168.1.70:6803/12325 192.168.1.70:6804/12325 192.168.1.70:6805/12325
> exists,up d59119d5-bccb-43ea-be64-9d2272605617
> osd.1 up   in  weight 1 up_from 748 up_thru 748 down_at 745
> last_clean_interval [20,744) 192.168.1.72:6800/4271 192.168.1.72:6801/4271
> 192.168.1.72:6802/4271 192.168.1.72:6803/4271 exists,up
> 930c097a-f68b-4f9c-a6a1-6787a1382a41
>
> pg_temp 0.12 [1,0,3]
> pg_temp 0.16 [1,0,3]
> pg_temp 0.18 [1,0,3]
> pg_temp 1.11 [1,0,3]
> pg_temp 1.15 [1,0,3]
> pg_temp 1.17 [1,0,3]
>
> Slowdowns increase the load of my nginx servers to around 40, and access to
> the CephFS mount is incredibly slow.  These slowdowns happen about once a
> week.  I typically solve them by restarting the MDS.
>
> When the cluster gets slow I see the following in my logs:
>
> 2013-10-21 14:33:54.079200 7f6301e10700  0 log [WRN] : slow request
> 30.281651 seconds old, received at 2013-10-21 14:33:23.797488:
> osd_op(mds.0.8:16266 100004094c4.00000000 [tmapup 0~0] 1.91102783 e750) v4
> currently commit sent
> 013-10-21 14:33:54.079191 7f6301e10700  0 log [WRN] : 6 slow requests, 6
> included below; oldest blocked for > 30.281651 secs

If this is the only kind of slow request you're seeing (tmapup operations),
it looks like the MDS is flushing out directory updates and the OSD is
taking a long time to apply them. I'm betting you have very large
directories, so each of those directory updates is expensive for the OSD,
and the MDS gets backed up behind them because it's trying to flush the
directories out of its cache.
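
You can confirm that while a slowdown is in progress by asking the OSD what
it's working on via its admin socket. A quick sketch, assuming the default
socket path and osd.0 (adjust the id per host):

  ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_ops_in_flight

That lists the in-flight requests and how long each has been blocked, so you
can see whether it really is the tmapup updates from the MDS that are stuck.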

> Any advice? Would increasing the PG num for data and metadata help? Would
> moving the MDS to a host which does not also run an OSD be greatly
> beneficial?

Your PG counts are probably fine for a cluster of that size, although you
could try doubling them. More likely, though, your CephFS install just isn't
tuned for the directory sizes you're using. How large are your biggest
directories? Have you tried bumping up your mds cache size? (And what does
memory usage on those hosts look like?)
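
If you do want to experiment, something along these lines should work; a
rough sketch, not tested against your cluster, with the pool name and cache
value as placeholders to adjust:

  # double the PG count on the data pool (raise pgp_num to match afterwards)
  ceph osd pool set data pg_num 128
  ceph osd pool set data pgp_num 128

  # in ceph.conf on the MDS hosts, raise the dentry/inode cache
  # (the default mds cache size is 100000 entries)
  [mds]
      mds cache size = 300000

Restart the MDS after changing the cache size, and keep an eye on memory on
that host, since a bigger cache means a bigger mds process.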
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



