Re: snapshot removal slows cluster

On 04/26/17 14:54, Vladimir Prokofev wrote:
Hello ceph-users.

Short description: during snapshot removal, OSD utilisation goes up to 100%, which leads to slow requests and VM failures due to IOPS stalls.

We're using OpenStack Cinder with a CEPH cluster as the volume backend. The CEPH version is 10.2.6.
We also use cinder-backup to create backups of those volumes in CEPH, which I believe uses the snapshot and layering features.
The cluster consists of 5 OSD nodes with mixed SSD/HDD storage, separate SSDs for the HDD journals, separate 10Gb/s public and private networks, and 3 MON nodes. We also have a single "backup" node which is responsible for the "backups" pool, handled by CRUSH map rules.

While creating a backup everything looks fine. The backup node is overwhelmed with load, but that's to be expected. The problem begins when we start deleting old backups.
While an old backup is being deleted, utilization of the main nodes' OSDs skyrockets up to 100%. This leads to slow requests in the main storage pools, which, given enough time, can lead to process hangs or at least SCSI reset attempts, and in the worst case to VM hangs.

I'm looking for a solution to avoid this issue.

So far I understand that I don't know how CEPH snapshot mechanics work at all, because I can't figure out why deleting a backup leads to requests not to the backup OSDs, where the backup data is actually stored, but rather to the main OSDs, where the original objects reside. Is there any good documentation on this?

Googling shows that I'm not the first one to encounter this issue, but I couldn't find an exact solution anywhere. Here's a short list of ideas:
 - use osd snap trim priority = 1. This is reported as not very helpful, since it is already lower than the client IO priority of 63;
 - use osd_snap_trim_sleep, but as far as I can see it's broken in jewel and will only be fixed in 10.2.8 - http://tracker.ceph.com/issues/19328;
 - disabling the fast-diff and object-map features seems to help, but I'm not sure what the tradeoffs are for this scenario (rough example commands below).
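
For reference, here is roughly how I'd apply the first and third ideas at runtime (just a sketch assuming Jewel-era syntax; the pool/image name is a placeholder):

    # lower the snap trim priority on all running OSDs
    ceph tell osd.* injectargs '--osd_snap_trim_priority 1'

    # disable fast-diff and object-map on one image
    # ("volumes/volume-XXXX" is a placeholder; fast-diff must go before object-map)
    rbd feature disable volumes/volume-XXXX fast-diff object-map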

Sounds like you researched it well, but you missed the most important setting:

    osd_pg_max_concurrent_snap_trims=1

(default is 2)
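
If you want to try it without an OSD restart, something like this should apply it at runtime (a sketch, assuming the usual injectargs syntax):

    ceph tell osd.* injectargs '--osd_pg_max_concurrent_snap_trims 1'

and the equivalent ceph.conf entry to make it persistent across restarts:

    [osd]
    osd pg max concurrent snap trims = 1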

Also, somebody said this might help by making directory splitting happen less often (though possibly doing more work at once):

    filestore_split_multiple = 8

(default is 2)
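
For completeness, that would be a ceph.conf entry in the [osd] section, something like this (a sketch; as far as I know it only applies to FileStore OSDs and only influences future directory splits, and I'd expect an OSD restart to be needed for it to take effect):

    [osd]
    filestore split multiple = 8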


I'd appreciate any ideas on how to fix this.


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
