Re: Snapshot cleanup performance impact on client I/O?

"This same behavior can be seen when deleting an RBD that has 100,000 objects vs 200,000 objects, it takes twice as long"

Correction: the time scales quadratically rather than linearly, so doubling the object count roughly quadruples the time, but that's not really the most important part of the response.

On Fri, Jun 30, 2017 at 4:24 PM David Turner <drakonstein@xxxxxxxxx> wrote:
When you delete a snapshot, Ceph places the removed snapshot into a list in the OSD map and places the snapshot's objects into a snap_trim_q.  Once those two things are done, the rbd command returns and you move on to the next snapshot.  The snap_trim_q is an n^2 operation (like all deletes in Ceph), which means that if a queue of 100 objects takes 5 minutes to work through, a queue of 200 objects will take roughly 20 minutes, i.e. four times as long (exaggerated time frames to show the math).  This same behavior can be seen when deleting an RBD that has 100,000 objects vs 200,000 objects; it takes twice as long (note that the object map feature mitigates this greatly by skipping any object that was never created, so the previous test would be easiest to duplicate by disabling the object map on the test RBDs).
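If you want to check whether the object map is in play on an image, and get a feel for how much data each snapshot actually holds before you trim it, something along these lines works (the pool/image names below are placeholders):

    # Placeholders: substitute your own pool/image.
    # Look for object-map (and fast-diff) in the feature list:
    rbd info rbd/myimage | grep features

    # Per-snapshot space usage; quick with fast-diff enabled, otherwise rbd
    # has to walk every object to calculate it:
    rbd du rbd/myimage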

So paying attention to the size of each snapshot as you clean them up is more important than how many snapshots you clean up.  Being on Jewel, you don't really want to use osd_snap_trim_sleep, as it literally puts a sleep into the main op threads of the OSD.  In Hammer this setting was much more useful (if still super hacky), and in Luminous the entire process was revamped and (hopefully) fixed.  Jewel is pretty much not viable for large quantities of snapshots, but there are ways to get through them.
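As a rough sketch of one way to get through them, pace the deletions yourself and let the trim queue drain between snapshots.  The pool/image names and the sleep length below are placeholders you would tune against the slow requests you actually see:

    # Placeholders: substitute your own pool/image; assumes snapshot names
    # contain no whitespace.
    POOL=rbd
    IMAGE=myimage

    # Plain 'rbd snap ls' output is "SNAPID NAME SIZE", so skip the header row
    # and take the second column as the snapshot name.
    for SNAP in $(rbd snap ls "$POOL/$IMAGE" | awk 'NR>1 {print $2}'); do
        rbd snap rm "$POOL/$IMAGE@$SNAP"
        # Give snap trimming a chance to drain before queueing the next snapshot.
        sleep 300
    done

Watching ceph -s for blocked requests between deletions, and lengthening the sleep when they show up, is the crude but safe knob on Jewel.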

The following thread on the ML is one of the most informative on this problem in Jewel.  The second link picks the thread back up months later, after the fix was scheduled for backporting into 10.2.8.

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-January/015675.html
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-April/017697.html

On Fri, Jun 30, 2017 at 4:02 PM Kenneth Van Alstyne <kvanalstyne@xxxxxxxxxxxxxxx> wrote:
Hey folks:
        I was wondering if the community can provide any advice.  Over time, and due to some external issues, we have managed to accumulate thousands of snapshots of RBD images, which are now in need of cleaning up.  I recently attempted to roll through a “for” loop performing an “rbd snap rm” on each snapshot, sequentially, waiting until the rbd command finished before moving on to the next one, of course.  Shortly after starting this, I began seeing thousands of slow ops, and a few of our guest VMs became unresponsive, naturally.
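Roughly speaking, the loop looked like this (the pool and image names here are placeholders; our real names differ):

    # Placeholders: substitute the real pool/image.
    # Remove every snapshot of the image, one at a time, waiting for each
    # 'rbd snap rm' to return before starting the next.
    for SNAP in $(rbd snap ls "$POOL/$IMAGE" | awk 'NR>1 {print $2}'); do
        rbd snap rm "$POOL/$IMAGE@$SNAP"
    done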

My questions are:
        - Is this expected behavior?
        - Is the background cleanup asynchronous from the “rbd snap rm” command?
                - If so, are there any OSD parameters I can set to reduce the impact on production?
        - Would “rbd snap purge” be any different?  I expect not, since fundamentally, rbd is performing the same action that I do via the loop.

Relevant details are as follows, though I’m not sure cluster size *really* has any effect here:
        - Ceph: version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
        - 5 storage nodes, each with:
                - 10x 2TB 7200 RPM SATA Spindles (for a total of 50 OSDs)
                - 2x Samsung MZ7LM240 SSDs (used as journal for the OSDs)
                - 64GB RAM
                - 2x Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz
                - 20GBit LACP Port Channel via Intel X520 Dual Port 10GbE NIC

Let me know if I’ve missed something fundamental.

Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC
Service-Disabled Veteran-Owned Business
1775 Wiehle Avenue Suite 101 | Reston, VA 20190
c: 228-547-8045 f: 571-266-3106
www.knightpoint.com
DHS EAGLE II Prime Contractor: FC1 SDVOSB Track
GSA Schedule 70 SDVOSB: GS-35F-0646S
GSA MOBIS Schedule: GS-10F-0404Y
ISO 20000 / ISO 27001 / CMMI Level 3


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
