On Mon, Mar 6, 2017 at 3:03 PM, Daniel Davidson <danield at igb.illinois.edu> wrote:
> Thanks for the suggestion, however I think my more immediate problem is the
> ms_handle_reset messages. I do not think the MDS daemons are getting the
> updates when I send them.

I wouldn't assume that.  You can check the current config state to see
that your values got through by using "ceph daemon mds.<id> config show".

John

>
> Dan
>
>
> On 03/04/2017 09:08 AM, John Spray wrote:
>>
>> On Fri, Mar 3, 2017 at 9:48 PM, Daniel Davidson
>> <danield at igb.illinois.edu> wrote:
>>>
>>> ceph daemonperf mds.ceph-0
>>> -----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
>>> rlat inos caps|hsr  hcs  hcr |writ read actv|recd recy stry purg|segs evts subm|
>>>   0  336k  97k|  0    0    0 |  0    0   20 |  0    0  246k   0 | 31   27k    0
>>>   0  336k  97k|  0    0    0 |112    0   20 |  0    0  246k  55 | 31   26k   55
>>>   0  336k  97k|  0    1    0 | 90    0   20 |  0    0  246k  45 | 31   26k   45
>>>   0  336k  97k|  0    0    0 |  2    0   20 |  0    0  246k   1 | 31   26k    1
>>>   0  336k  97k|  0    0    0 |166    0   21 |  0    0  246k  83 | 31   26k   83
>>>
>>> I have too many strays, which seem to be causing disk-full errors when
>>> deleting many files (hundreds of thousands); the number here is down from
>>> over 400k. I have been trying to raise the purge limits to speed this up,
>>> but it is not happening:
>>>
>>> ceph tell mds.ceph-0 injectargs --mds-max-purge-ops-per-pg 2
>>> 2017-03-03 15:44:00.606548 7fd96400a700  0 client.225772 ms_handle_reset on 172.16.31.1:6800/55710
>>> 2017-03-03 15:44:00.618556 7fd96400a700  0 client.225776 ms_handle_reset on 172.16.31.1:6800/55710
>>> mds_max_purge_ops_per_pg = '2'
>>>
>>> ceph tell mds.ceph-0 injectargs --mds-max-purge-ops 16384
>>> 2017-03-03 15:45:27.256132 7ff6d900c700  0 client.225808 ms_handle_reset on 172.16.31.1:6800/55710
>>> 2017-03-03 15:45:27.268302 7ff6d900c700  0 client.225812 ms_handle_reset on 172.16.31.1:6800/55710
>>> mds_max_purge_ops = '16384'
>>>
>>> I do have a backfill running, as I also have a new node that is almost done.
>>> Any ideas as to what is going on here?
>>
>> Try also increasing mds_max_purge_files.  If your files are small
>> then that is likely to be the bottleneck.
>>
>> John
>>
>>> Dan
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users at lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
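
For reference, a minimal sketch of the verification John suggests, assuming shell access to the host where mds.ceph-0 runs and the default admin socket location (the MDS id "ceph-0" and the option names are taken from the thread above):

  # Run locally on the MDS host; the admin socket avoids the short-lived
  # "ceph tell" client connection that produces the ms_handle_reset lines.
  ceph daemon mds.ceph-0 config get mds_max_purge_ops
  ceph daemon mds.ceph-0 config get mds_max_purge_files

  # Or dump the whole running config and filter for the purge settings:
  ceph daemon mds.ceph-0 config show | grep purge

If the values shown match what was injected, the MDS did receive the updates; the ms_handle_reset messages are typically just the command client's connection being torn down, not a sign the setting was rejected.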