Hi all,

Has anyone else noticed a p99.99+ tail latency regression for RBD workloads in Quincy vs. pre-Pacific, i.e., before the kv_onode cache existed?

Some notes from what I have seen so far:

* Restarting OSDs temporarily resolves the problem... then, as activity accrues over time, the problem becomes appreciably worse again
* Comparing profiles of running OSDs, the bluestore block allocators are noticeably more active than in the older releases (even though the fragmentation scores of the Quincy OSDs are far better in this case)
* The new kv_onode cache often looks like it is bursting at the seams, whereas the kv/meta/data caches have breathing room

I am becoming increasingly confident that these observations are related, though I have not dug deeply enough into bluestore to reason about how/when onodes are allocated on disk and close the loop.

Anyway, I am posting this to ask whether the priority cache defaults for the new kv_onode slab need a slight nudge. You can observe the per-slab sizes on an OSD by setting debug_bluestore to 20/20 for a second and grepping the log for cache_size; rough commands are in the P.S. below.

Cheers,
Tyler
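
P.S. Here is roughly what I have been running, in case anyone wants to compare numbers. Treat osd.0 and the log path as placeholders for your own environment, and the level I revert to is just what we normally run:

    ceph tell osd.0 config set debug_bluestore 20/20
    sleep 2
    ceph tell osd.0 config set debug_bluestore 1/5    # back to our usual level
    grep cache_size /var/log/ceph/ceph-osd.0.log | tail

If the kv_onode slab really is starved, I believe the knob to nudge would be bluestore_cache_kv_onode_ratio (default 0.04, if I am reading the options correctly). Something like the following, though I have not yet tested whether it actually helps the tail:

    ceph config set osd bluestore_cache_kv_onode_ratio 0.08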