Hi,

since we upgraded to Luminous we have had an issue with snapshot deletion that could be related: when a largish snapshot (a few TB) gets deleted, we see a spike in the load of the OSD daemons, followed by a brief flap of the daemons themselves. It seems that while the snapshot has been deleted as far as rbd is concerned, the trimming process simply stops because of the flapping, leaving the PGs in an inconsistent state. In the snapshot case such an inconsistency is detected later during scrubbing, but I wonder whether, if something similar happened during the deletion process, it would go undetected by the mons/mgr.

We fixed the issue by changing "osd_snap_trim_sleep" from "0" to "2.0", but I am still pondering whether "osd_delete_sleep" should be changed as well; the issue you are reporting could be an indication of that. A rough sketch of how such a change can be applied follows below, before the quoted thread.

Best,
Mattia
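For what it's worth, a change like that can be applied roughly like this. Treat it as a sketch only: the 2.0 value for osd_delete_sleep is just a guess, osd.0 is an example daemon, and how you persist settings depends on your deployment, so check it against your own cluster first.

    # Throttle snapshot trimming on all OSDs at runtime (the change we made):
    ceph tell osd.* injectargs '--osd_snap_trim_sleep 2.0'

    # Possibly throttle object deletion as well (untested on our side, value is a guess):
    ceph tell osd.* injectargs '--osd_delete_sleep 2.0'

    # Persist the settings in the [osd] section of ceph.conf on the OSD hosts
    # so they survive a daemon restart:
    #   osd_snap_trim_sleep = 2.0
    #   osd_delete_sleep = 2.0

    # Verify the running value via the admin socket on an OSD host:
    ceph daemon osd.0 config get osd_snap_trim_sleep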
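Regarding the manual cleanup of the leftover rbd_data objects discussed in the quoted thread below, a very rough sketch of how those objects could be identified and removed is shown here. The pool name "volumes" and the <prefix> placeholder are examples only, the rbd_data.<prefix>.<objectnumber> naming pattern is an assumption about how the leftovers look in your pool, and rados rm is irreversible, so verify every step before running anything.

    # List the data prefix(es) of the leftover objects
    # (names typically look like rbd_data.<prefix>.<objectnumber>):
    rados -p volumes ls | grep '^rbd_data\.' | cut -d. -f2 | sort -u

    # Make sure no surviving image still uses that prefix before touching anything:
    for img in $(rbd ls volumes); do rbd info volumes/$img | grep block_name_prefix; done

    # Only then remove the orphaned objects for one specific prefix
    # (slow, and there is no undo):
    rados -p volumes ls | grep '^rbd_data\.<prefix>\.' | while read -r obj; do
        rados -p volumes rm "$obj"
    done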
On 1/15/20 10:16 AM, 徐蕴 wrote:
> Not every volume. It seems that volumes with a high capacity have a higher probability of triggering this problem.
>
>> On Jan 15, 2020, at 4:28 PM, Eugen Block <eblock@xxxxxx> wrote:
>>
>> Then it's probably something different. Does that happen with every volume/image or just this one time?
>>
>> Quoting 徐蕴 <yunxu@xxxxxx>:
>>
>>> Hi Eugen,
>>>
>>> Thank you for sharing your experience. I will dig into the OpenStack Cinder logs to check whether something happened there. The strange thing is that the volume I deleted was not created from a snapshot and doesn't have any snapshots. The rbd_id.xxx, rbd_header.xxx and rbd_object_map.xxx objects were deleted; only a lot of rbd_data objects were left behind. I plan to delete those objects manually.
>>>
>>> br,
>>> Xu Yun
>>>
>>>> On Jan 15, 2020, at 3:50 PM, Eugen Block <eblock@xxxxxx> wrote:
>>>>
>>>> Hi,
>>>>
>>>> this might happen if you try to delete images/instances/volumes in OpenStack that are somehow linked, e.g. if there are snapshots. I have experienced this in Ocata, too: deleting a base image worked, but because existing clones depended on it, basically only the OpenStack database was updated while the base image still existed within Ceph.
>>>>
>>>> Try to figure out whether that is also the case here. If it's something else, check the logs in your OpenStack environment; maybe they reveal something. Also check the Ceph logs.
>>>>
>>>> Regards,
>>>> Eugen
>>>>
>>>> Quoting 徐蕴 <yunxu@xxxxxx>:
>>>>
>>>>> Hello,
>>>>>
>>>>> My setup is Ceph Pike working with OpenStack. When I deleted an image, I found that its space was not reclaimed. I checked with rbd ls and confirmed that the image had disappeared, but when I checked the objects with rados ls, most objects named rbd_data.xxx still existed in my cluster. rbd_object_map and rbd_header were already deleted. I waited for several hours and no further deletion happened. Is this a known issue, or is something wrong with my configuration?
>>>>>
>>>>> br,
>>>>> Xu Yun

-- 
Mattia Belluco
S3IT Services and Support for Science IT
Office Y11 F 52
University of Zürich
Winterthurerstrasse 190, CH-8057 Zürich (Switzerland)
Tel: +41 44 635 42 22

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx