I spent a good part of today looking at http://tracker.ceph.com/issues/12297.

Long story short: the workload is doing a readdir at the same time as it's unlinking files. The readdir functions (in this case, _readdir_cache_cb) drop the client_lock each time they invoke the callback (for obvious reasons). There is some effort in _readdir_cache_cb to keep the iterator valid (on each loop we check that we aren't at the end, and we increment the iterator before dropping the lock), but it's not sufficient.

Is there supposed to be something preventing this kind of race? If not, I can work something out in the code, but I've not done much work in that area and there are enough moving pieces that I wonder if I'm missing some other issue.

-Greg
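
For illustration, here is a minimal, self-contained sketch of the pattern in question and one possible way around it. It is not the Client code: ToyDir, toy_readdir, and demo are hypothetical names, a std::map stands in for the dentry cache, and the "concurrent" unlink is simulated by having the callback do it. The assumption is that an unlink arriving while the lock is dropped can remove the very entry a pre-incremented iterator points at. The sketch avoids holding any iterator across the unlock by remembering the last name returned and re-seeking after relocking.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a cached directory; not the real Ceph Client types.
struct ToyDir {
  std::mutex lock;                      // plays the role of client_lock
  std::map<std::string, int> dentries;  // name -> inode number

  void unlink(const std::string& name) {
    std::lock_guard<std::mutex> l(lock);
    dentries.erase(name);
  }
};

// Calls cb(name) for each entry with the lock dropped, the way the readdir
// callbacks are invoked. Instead of carrying an iterator across the unlocked
// region, it remembers the last name handed out and re-seeks after relocking.
std::vector<std::string> toy_readdir(
    ToyDir& dir, const std::function<void(const std::string&)>& cb) {
  std::vector<std::string> seen;
  std::unique_lock<std::mutex> l(dir.lock);
  auto it = dir.dentries.begin();
  while (it != dir.dentries.end()) {
    std::string name = it->first;  // copy: the node may be erased while unlocked
    seen.push_back(name);
    l.unlock();
    cb(name);                      // may unlink anything, including the next entry
    l.lock();
    it = dir.dentries.upper_bound(name);  // fresh iterator obtained under the lock
  }
  return seen;
}

// Deterministic stand-in for the race: the callback unlinks the *next* entry,
// which is the one a pre-incremented iterator would have been pointing at.
std::vector<std::string> demo() {
  ToyDir dir;
  dir.dentries = {{"a", 1}, {"b", 2}, {"c", 3}};
  return toy_readdir(dir, [&dir](const std::string& name) {
    if (name == "a") dir.unlink("b");
  });
}
```

In this toy, demo() visits "a" and "c" and skips the unlinked "b" without touching a dead iterator. The cost is an O(log n) lookup per entry, and it relies on a stable sort order, which may or may not match how the real dentry list is kept.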