On 17-12-14 05:31 PM, David Turner wrote:
> I've tracked this in a much more manual way. I would grab a random subset
> [..]
> This was all on a Hammer cluster. The changes that moved the snap trimming
> queues into the main OSD thread made our use case non-viable on Jewel until
> fixes landed in Jewel after I left. It's exciting that this will actually be
> a reportable value from the cluster.
> Sorry that this story doesn't really answer your question, except to say
> that people aware of this problem likely have a workaround for it. However,
> I'm certain that many more clusters are impacted by this than are aware of
> it, and being able to see it quickly would be beneficial when troubleshooting
> problems. Backporting would be nice. I run a few Jewel clusters that host
> some VMs, and it would be nice to see how well those clusters handle snap
> trimming, though they rely far less heavily on snapshots.
Thanks for your response; it pretty much confirms what I thought:
- users aware of the issue have their own hacks, which don't need to be
efficient or convenient;
- users unaware of the issue are, well, unaware, and at risk of serious
service disruption once disk space runs out.
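To illustrate the kind of hack the first point refers to, here is a minimal Python sketch that sums a per-PG snap-trim queue length out of `ceph pg dump --format json` output. The `snaptrimq_len` field name and the flat `pg_stats` layout are assumptions about the reporting format (it varies by release), so treat this as a sketch rather than something that works everywhere:

```python
import json

def total_snaptrim_backlog(pg_dump_json: str) -> int:
    """Sum snaptrimq_len across all PGs in a `ceph pg dump` JSON blob.

    Assumes a top-level "pg_stats" array whose entries may carry a
    "snaptrimq_len" counter; PGs without the field count as zero.
    """
    dump = json.loads(pg_dump_json)
    return sum(pg.get("snaptrimq_len", 0) for pg in dump.get("pg_stats", []))

# Canned example; in practice you would feed in the output of
# `ceph pg dump --format json` instead.
sample = json.dumps({
    "pg_stats": [
        {"pgid": "1.0", "snaptrimq_len": 12},
        {"pgid": "1.1", "snaptrimq_len": 0},
        {"pgid": "1.2", "snaptrimq_len": 30},
    ]
})
print(total_snaptrim_backlog(sample))  # → 42
```

Polling something like this from cron and graphing the total is roughly the manual tracking described above.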
Hopefully that will be convincing enough for the devs. ;)
--
Piotr Dałek
piotr.dalek@xxxxxxxxxxxx
https://www.ovh.com/us/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com