On Tue, Mar 13, 2018 at 11:35 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> Radoslaw noticed that the nlock field in common/Mutex.h was slowing us
> down by 5% (and was basically useless). Also, all of the conditional
> behavior based on constructor arguments was fixed at compile time.
>
> This points (again) at the fact that we have our own mutex wrapper that
> we're using instead of std::mutex. Chatted a bit with Josh and here is
> our thinking:
>
> - Someday we could replace our lockdep with helgrind, but there is a lot
> of work needed to get there.

We had to stop running librbd through helgrind since it was fundamentally
broken with the use of static lock initializers. It would just crash
internally when it didn't realize that a lock had been freed and the
memory address was later re-used for a different lock.

> - Right now, our lockdep provides a lot of value because it is enabled by
> default with vstart clusters, so we catch most ordering issues there. (We
> also have some teuthology tests that enable it.)
>
> - We could convert our Mutex class to conform to the C++ Mutex concept.
> This mainly means renaming Lock -> lock, TryLock -> try_lock, Unlock ->
> unlock. That would let us use things like std::unique_lock<> and
> std::lock_guard<> with it. Cond would need a similar makeover.
>
> - We can rename it (or alias it) to something like debug_mutex.
>
> - We can alias ceph::mutex to either std::mutex or ceph::debug_mutex at
> compile time. Production builds can exclude our lockdep and use
> std::mutex exclusively. Default dev/vstart builds can use debug_mutex.
> Maybe the notcmalloc flavor uses debug_mutex and the default flavor uses
> std::mutex, so that we have both available for teuthology tests.
>
> - All of our existing code can be converted to use ceph::mutex and
> ceph::condition_variable instead of Mutex and Cond.
>
> - Profit!
>
> How does that sound?

While fixing a bug in the fio RBD engine last week, I spent some time
generating and looking at on-CPU and off-CPU perf flamegraphs for small
IO workloads. I never saw "nlock" taking up any time in the graphs for
librbd/librados, but for librbd::AioCompletion and librados::AioCompletion
I did see ~5% of the time being wasted on string constructors for the
lock name. I have a branch where I switched just librbd::AioCompletion
and librados::AioCompletion to use std::mutex, since the value of lockdep
seemed pretty minimal (if not non-existent) for those two cases.

> sage

--
Jason
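
For illustration, here is a minimal sketch of what the compile-time
ceph::mutex alias described above could look like. The CEPH_DEBUG_MUTEX
define, the debug_mutex body, and the release-build shim are assumptions
made up for this example, not actual Ceph code:

  // build with -DCEPH_DEBUG_MUTEX to get the lockdep-style wrapper
  #include <mutex>
  #include <string>

  namespace ceph {

  // Debug flavor: meets the standard Lockable requirements
  // (lock / try_lock / unlock), so std::lock_guard<> and
  // std::unique_lock<> work with it unchanged.
  class debug_mutex {
  public:
    explicit debug_mutex(const char* n) : name(n) {}
    void lock()     { /* lockdep would record ordering for 'name' here */ m.lock(); }
    bool try_lock() { return m.try_lock(); }
    void unlock()   { m.unlock(); }
  private:
    std::mutex m;
    std::string name;  // per-instance name whose constructor shows up in profiles
  };

  #ifdef CEPH_DEBUG_MUTEX
  using mutex = debug_mutex;
  #else
  // Release flavor: a thin shim over std::mutex that discards the name,
  // so call sites compile identically and no std::string is constructed.
  class mutex : public std::mutex {
  public:
    explicit mutex(const char*) {}
  };
  #endif

  } // namespace ceph

  // Call sites look the same under either flavor:
  void example() {
    ceph::mutex my_lock("example::my_lock");
    std::lock_guard<ceph::mutex> guard(my_lock);
    // ... critical section ...
  }

Because both flavors satisfy the Lockable requirements, existing code can be
converted once to ceph::mutex plus the standard RAII helpers, and a release
build never pays for constructing the lock-name string.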