On Monday, 19 March 2018 at 10:01 +0000, Sergey Malinin wrote:
> I experienced the same issue and was able to reduce metadata writes
> by raising mds_log_events_per_segment to its original value
> multiplied several times.

I changed it from 1024 to 4096 :
* rsync status (1 line per file) scrolls much quicker
* OSD writes on the dashboard are now much lower than reads (they were
  much higher before)
* metadata pool write rate is now in the 20-800kBps range, while
  metadata reads are in the 20-80kBps range
* data pool reads are in the hundreds of kBps, which still seems very low
* destination disk write rate is a bit higher than the data pool read
  rate (expected for btrfs), but still low
* inter-DC network load is now 1-50Mbps

I'll keep an eye on the Munin graphs over the long run.

I can't find any documentation about the mds_log_events_per_segment
setting, especially on how to choose a good value. Can you elaborate on
"original value multiplied several times"? I'm just seeing more
MDS_TRIM warnings now; maybe restarting the MDSs only delayed the
re-emergence of the initial problem. (A rough sketch of the commands
involved follows the quoted message below.)

> ________________________________
> From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of
> Nicolas Huillard <nhuillard@xxxxxxxxxxx>
> Sent: Monday, March 19, 2018 12:01:09 PM
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Huge amount of cephfs metadata writes while only reading
> data (rsync from storage, to single disk)
>
> Hi all,
>
> I'm experimenting with a new little storage cluster. I wanted to take
> advantage of the week-end to copy all the data (1TB, 10M objects)
> from the cluster to a single SATA disk. I expected to saturate the
> SATA disk while writing to it, but the storage cluster actually
> saturates its network links while barely writing to the destination
> disk (63GB written in 20h, that's less than 1MBps).
>
> Setup : 2 datacenters × 3 storage servers × 2 disks/OSDs each,
> Luminous 12.2.4 on Debian stretch, 1Gbps shared network, 200Mbps
> fibre link between datacenters (12ms latency). 4 clients using a
> single cephfs storing data + metadata on the same spinning disks
> with bluestore.
>
> Test : I'm using a single rsync on one of the client servers (the
> other 3 are just sitting there). rsync is local to the client,
> copying from the cephfs mount (kernel client on 4.14 from
> stretch-backports, just to use a potentially more recent cephfs
> client than the one in stock 4.9) to the SATA disk. The rsync'ed
> tree consists of lots of tiny files (1-3kB) in deep directory
> branches, along with some large files (10-100MB) in a few
> directories. There is no other activity on the cluster.
>
> Observations : I initially saw write performance on the destination
> disk ranging from a few 100kBps (while exploring branches with tiny
> files) to a few 10MBps (while copying large files), essentially
> seeing the file names scroll at a relatively fixed rate, unrelated
> to their individual size.
> After 5 hours, the fibre link started to saturate at 200Mbps, while
> destination disk writes dropped to a few 10kBps.
>
> Using the dashboard, I see lots of metadata writes, at a 30MBps rate
> on the metadata pool, which correlates with the 200Mbps link rate.
> It also shows regular "Health check failed: 1 MDSs behind on
> trimming (MDS_TRIM)" / "MDS health message (mds.2): Behind on
> trimming (64/30)".
>
> I wonder why cephfs would write anything to the metadata pool (I'm
> mounting the clients with "noatime") while I'm just reading data
> from it...
> What could I tune to reduce that write-load-while-reading-only?
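
PS: for reference, here is roughly how these journal knobs can be
inspected and changed on Luminous. This is only a sketch of what I did:
the MDS daemon id (derived from the hostname here) and access to the
admin socket on the MDS host are assumptions about the setup.

  # Current values, via the MDS admin socket on the MDS host:
  ceph daemon mds.$(hostname -s) config show | grep -E 'mds_log_(events_per_segment|max_segments)'

  # Change at runtime (not persistent across MDS restarts):
  ceph daemon mds.$(hostname -s) config set mds_log_events_per_segment 4096

  # Persist across restarts by adding it to ceph.conf on the MDS hosts:
  #   [mds]
  #   mds_log_events_per_segment = 4096

  # Watch whether the trimming warning comes back:
  ceph health detail | grep -i trim

If I read the MDS_TRIM message right, the "30" in "(64/30)" is
mds_log_max_segments, so that knob may be worth watching too.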
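
PPS: for clarity, the test described in the quoted message above boils
down to something like the following; the monitor addresses, mount
options, paths and rsync flags are illustrative, not necessarily the
exact ones I used.

  # Kernel-client cephfs mount with noatime (monitor names and secret file are placeholders):
  mount -t ceph mon1,mon2,mon3:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret,noatime

  # Local copy from the cephfs mount to the SATA disk, printing one line per file:
  rsync -aHv /mnt/cephfs/ /mnt/sata/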
--
Nicolas Huillard
Founding partner - Technical Director - Dolomède

nhuillard@xxxxxxxxxxx
Landline : +33 9 52 31 06 10
Mobile : +33 6 50 27 69 08
http://www.dolomede.fr/
https://reseauactionclimat.org/planetman/
http://climat-2020.eu/
http://www.350.org/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com