After our conversation this morning I went through the locking a bit more and I think we're in okay shape. To summarize:

These 'forward lookup' structures are protected by coll->lock:

  Bnode::blob_map -> Blob                                       coll->lock
  Onode::blob_map -> Blob                                       coll->lock

These ones are protected by cache->lock:

  Collection::OnodeSpace::onode_map -> Onode (unordered_map)    cache->lock
  Blob::bc -> Buffer                                            cache->lock

The BnodeSet is a bit different because it is depopulated when the last onode ref goes away. But it has its own lock:

  Collection::BnodeSet::uset -> Bnode (intrusive set)           BnodeSet::lock

Anyway, the point of this is that the cache trim() can do everything it needs with just cache->lock. That means that during an update, we normally have coll->lock to protect the structures we're touching, and if we are adding onodes to OnodeSpace or buffers to BufferSpace we additionally take cache->lock for the appropriate cache fragment.

We were getting tripped up by the blob_map iteration in _txc_state_proc because we were holding no locks at all (or, previously, a collection lock that may or may not have been the right one). Igor's PR fixes this by making Blob refcounted and keeping a list of the blobs being written. The finish_write() function takes cache->lock as needed.

Also, it's worth pointing out that the blobs we're touching will all exist under an Onode that is in the onodes list, and it turns out that trim() is already doing the right thing and not trimming Onodes that still have any refs.

Which leaves me a bit confused as to how we were originally crashing, because we were taking the first_collection lock. My guess is that first_collection was not the right collection and a racing update was modifying the Onode. The refcounting alone sorts this out. My other fix would also have resolved it by taking the correct collection's lock, I think.

Unless there is another locking problem I'm not seeing, I think what is in master now has it covered.

Igor, Somnath, does the current strategy make sense?
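For reference, here is a rough C++ sketch of the lock ordering described above. It is not the actual BlueStore code: the types are stripped-down stand-ins, std::mutex and std::shared_ptr stand in for the real lock and ref types, and write_blob(), demo(), the `buffered` flag and the trim(Collection&) signature are hypothetical names made up for illustration. It only shows the shape of the idea: the update path takes coll->lock and then cache->lock, finish_write() and trim() need only cache->lock, and trim() leaves alone any Onode that still has an outside ref.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified stand-ins (assumptions), not the real BlueStore definitions.
struct Blob { bool buffered = false; };            // 'buffered' stands in for Blob::bc
using BlobRef = std::shared_ptr<Blob>;             // refcounted Blob

struct Onode { std::unordered_map<int, BlobRef> blob_map; };  // guarded by coll->lock
using OnodeRef = std::shared_ptr<Onode>;

struct Cache { std::mutex lock; };                 // plays the role of cache->lock

struct Collection {
  std::mutex lock;                                 // plays the role of coll->lock
  Cache* cache = nullptr;
  std::unordered_map<std::string, OnodeRef> onode_map;  // guarded by cache->lock
};

// Per-transaction lists of refs; these keep the objects alive until completion.
struct TransContext {
  std::vector<OnodeRef> onodes;
  std::vector<BlobRef> blobs;
};

// Update path: coll->lock first, then cache->lock only around the onode_map access.
void write_blob(Collection& c, TransContext& txc, const std::string& oid, int blob_id) {
  std::lock_guard<std::mutex> coll_guard(c.lock);
  OnodeRef o;
  {
    std::lock_guard<std::mutex> cache_guard(c.cache->lock);
    OnodeRef& slot = c.onode_map[oid];
    if (!slot) slot = std::make_shared<Onode>();
    o = slot;
  }
  BlobRef b = std::make_shared<Blob>();
  o->blob_map[blob_id] = b;                        // blob_map is covered by coll->lock
  txc.onodes.push_back(o);
  txc.blobs.push_back(b);
}

// Completion path: no collection lock held; the BlobRef keeps the blob valid.
void finish_write(Cache& cache, const BlobRef& b) {
  std::lock_guard<std::mutex> cache_guard(cache.lock);
  b->buffered = true;
}

// Trim needs only cache->lock and skips onodes that are referenced elsewhere.
// use_count() is only a rough check; it is good enough for this single-threaded demo.
std::size_t trim(Collection& c) {
  std::lock_guard<std::mutex> cache_guard(c.cache->lock);
  std::size_t trimmed = 0;
  for (auto it = c.onode_map.begin(); it != c.onode_map.end();) {
    if (it->second.use_count() == 1) { it = c.onode_map.erase(it); ++trimmed; }
    else ++it;
  }
  return trimmed;
}

// Walk one write through its lifetime; returns how many onodes the final trim removed.
std::size_t demo() {
  Cache cache;
  Collection coll;
  coll.cache = &cache;
  TransContext txc;
  write_blob(coll, txc, "obj", 0);
  std::size_t pinned = trim(coll);                 // txc still holds the Onode ref
  assert(pinned == 0);
  (void)pinned;
  for (const BlobRef& b : txc.blobs) finish_write(cache, b);
  assert(txc.blobs[0]->buffered);
  txc = TransContext();                            // transaction done, refs dropped
  return trim(coll);                               // now the onode can go
}
```

In this sketch the first trim() removes nothing while the transaction still holds its Onode ref, and the second trim() removes the Onode once the transaction's lists have been cleared.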
sage