Hello,
I've created a Ceph cluster with 3 nodes and a CephFS filesystem to serve a website. The site's speed is good enough (close to NFS speed), and it has HA if one of the FS nodes dies.
My problem starts when I deploy a git repository on that FS. The server generates a lot of IOPS while checking which files have to be updated, and then all clients start having problems using the FS (it becomes much slower).
In normal usage the site takes about 400 ms to load; once the problem starts it takes more than 3 s. To fix it I just have to remount the FS on the clients, but I can't remount the FS on every deploy...
While the deploy is running I see that CPU usage on the MDS is a bit higher, but when it finishes the CPU usage goes down again, so it doesn't look like a CPU problem.
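To put a number on the slowdown apart from the web page load time, a small metadata benchmark can be run on a client before and after a deploy. The following is only a minimal sketch: the function name time_stat_walk is hypothetical, and it assumes that walking the tree and calling lstat on every entry is a reasonable stand-in for the per-file checks git performs during a deploy.

```python
import os
import time


def time_stat_walk(root):
    """Walk `root`, lstat every directory and file below it,
    and return (entries_checked, elapsed_seconds)."""
    count = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # lstat asks the filesystem for the entry's metadata,
            # which on CephFS may involve the MDS.
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count, time.perf_counter() - start


# Example use (the path is an assumption, replace it with the real mount point):
#   n, secs = time_stat_walk("/mnt/cephfs/site")
#   print(f"{n} entries in {secs:.3f} s")
```

Comparing the elapsed time on a freshly mounted client with the time on a client that has gone slow would show whether metadata operations are the part that degrades.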
My config file is:
[global]
fsid = bf56854......e611c08
mon_initial_members = fs-01, fs-02, fs-03
mon_host = 10.50.0.94,10.50.1.216,10.50.2.52
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 10.50.0.0/22
osd pool default size = 3
##
### OSD
##
[osd]
osd_mon_heartbeat_interval = 5
osd_mon_report_interval_max = 10
osd_heartbeat_grace = 15
osd_fast_fail_on_connection_refused = True
osd_pool_default_pg_num = 128
osd_pool_default_pgp_num = 128
osd_pool_default_size = 2
osd_pool_default_min_size = 2
##
### Monitors
##
[mon]
mon_osd_min_down_reporters = 1
##
### MDS
##
[mds]
mds_cache_memory_limit = 792723456
mds_bal_mode = 1
##
### Client
##
[client]
client_cache_size = 32768
client_mount_timeout = 30
client_oc_max_objects = 2000
client_oc_size = 629145600
client_permissions = false
rbd_cache = true
rbd_cache_size = 671088640
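As a sanity check on the file above, here is a minimal sketch that reads a trimmed copy of it and reports two things: options that are set to different values in more than one section, and the byte-valued limits converted to MiB. It assumes that Ceph treats spaces and underscores in option names as equivalent; the helper names normalize, find_conflicts and to_mib are made up for this example, and SAMPLE contains only a few lines copied from the config.

```python
import configparser

# A few lines copied from the config above (not the full file).
SAMPLE = """
[global]
osd pool default size = 3
[osd]
osd_pool_default_size = 2
osd_pool_default_min_size = 2
[mds]
mds_cache_memory_limit = 792723456
[client]
client_oc_size = 629145600
rbd_cache_size = 671088640
"""


def normalize(key):
    """Map 'osd pool default size' and 'osd_pool_default_size' to one name."""
    return "_".join(key.strip().lower().split())


def find_conflicts(text):
    """Return {option: [(section, value), ...]} for options that appear
    with different values in more than one section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    seen = {}
    for section in parser.sections():
        for key, value in parser.items(section):
            seen.setdefault(normalize(key), []).append((section, value))
    return {k: v for k, v in seen.items() if len({val for _, val in v}) > 1}


def to_mib(value):
    """Convert a byte count given as a string to MiB."""
    return int(value) / (1024 * 1024)


print(find_conflicts(SAMPLE))
for name in ("792723456", "629145600", "671088640"):
    print(name, "bytes =", to_mib(name), "MiB")
```

On this sample the sketch flags osd_pool_default_size, which is 3 in [global] and 2 in [osd]; which of the two values takes effect is not something the sketch determines. The three limits come out as 756, 600 and 640 MiB.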
My cluster and clients use Debian 9 with the latest Ceph version (12.2.4). The clients use the kernel module to mount the share, because it is a bit faster than the FUSE client. The deploy is done on one of the Ceph nodes, which also has the FS mounted through the kernel module.
My cluster is not a high-usage cluster, so all daemons run on the same machines (3 machines, each with OSD, MON, MGR and MDS). Every OSD has a copy of the data, only one MGR is active, and two of the MDS daemons are active with one on standby. The clients mount the FS using the three MDS IP addresses and right now receive no requests, because the site is not published yet.
Does anyone know what could be happening? Everything works fine (even on another cluster I built with a high load), but as soon as I deploy the git repository everything becomes very slow.
Thanks!!
_________________________________________
Daniel Carrasco Marín
Ingeniería para la Innovación i2TIC, S.L.
Tlf: +34 911 12 32 84 Ext: 223
www.i2tic.com
_________________________________________
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com