Re: s/Mutex/ceph::mutex/, lockdep

On Tue, Mar 13, 2018 at 11:35 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> Radoslaw noticed that the nlock field in common/Mutex.h was slowing us
> down by 5% (and was basically useless).  Also, all of the conditional
> behavior based on constructor arguments was fixed at compile time.
>
> This points (again) at the fact that we have our own mutex wrapper that
> we're using instead of std::mutex.  Chatted a bit with Josh and here is
> our thinking:
>
> - Someday we could replace our lockdep with helgrind, but there is a lot
> of work needed to get there.

We had to stop running librbd through helgrind since it was
fundamentally broken by our use of static lock initializers: it would
crash internally because it didn't realize a lock had been freed and
its memory address later re-used for a different lock.

> - Right now, our lockdep provides a lot of value because it is enabled by
> default with vstart clusters, so we catch most ordering issues there.  (We
> also have some teuthology tests that enable it.)
>
> - We could convert our Mutex class to conform to the C++ Mutex concept.
> This mainly means renaming Lock -> lock, TryLock -> try_lock, Unlock ->
> unlock.  That would let us use things like std::unique_lock<> and
> std::lock_guard<> with it.  Cond would need a similar makeover.
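
For reference, a rough sketch of what that rename might look like
(illustrative only -- the lockdep hooks are elided, and everything
except the member names you listed is my assumption, not the actual
common/Mutex.h code):

  #include <mutex>

  class debug_mutex {
    std::mutex m;
  public:
    void lock() {           // was Lock(); lockdep ordering check goes here
      m.lock();
    }
    bool try_lock() {       // was TryLock()
      return m.try_lock();  // same lockdep bookkeeping on success
    }
    void unlock() {         // was Unlock(); lockdep release goes here
      m.unlock();
    }
  };

Since lock/try_lock/unlock satisfy the standard Lockable requirements,
std::lock_guard<debug_mutex> and std::unique_lock<debug_mutex> both
work unmodified.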
>
> - We can rename it (or alias it) to something like debug_mutex.
>
> - We can alias ceph::mutex to either std::mutex or ceph::debug_mutex at
> compile time.  Production builds can exclude our lockdep and use
> std::mutex exclusively.  Default dev/vstart builds can use debug_mutex.
> Maybe notcmalloc flavor uses debug_mutex and default flavor builds use
> std::mutex, so that we have both available for teuthology tests.
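
Concretely I'd picture the alias boiling down to something like this
(CEPH_DEBUG_MUTEX is a made-up macro name, and debug_condition_variable
is assumed to exist alongside debug_mutex):

  #include <mutex>
  #include <condition_variable>

  namespace ceph {
  #ifdef CEPH_DEBUG_MUTEX
    // dev/vstart/notcmalloc flavor: lockdep-instrumented wrappers
    using mutex = debug_mutex;
    using condition_variable = debug_condition_variable;
  #else
    // production flavor: zero-overhead standard types
    using mutex = std::mutex;
    using condition_variable = std::condition_variable;
  #endif
  }

One wrinkle: std::condition_variable only waits on
std::unique_lock<std::mutex>, so the debug flavor would need
std::condition_variable_any (or its own wrapper) to wait on
debug_mutex.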
>
> - All of our existing code can be converted to use ceph::mutex and
> ceph::condition_variable instead of Mutex and Cond.
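
Call sites would then convert pretty mechanically, e.g. (hypothetical
code, not from the tree):

  ceph::mutex m;                       // was: Mutex m("Foo::m");
  ceph::condition_variable cond;       // was: Cond cond;

  void wake() {
    std::lock_guard<ceph::mutex> l(m); // works once lock/unlock conform
    cond.notify_all();
  }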
>
> - Profit!
>
> How does that sound?

While fixing a bug in the fio RBD engine last week, I spent some time
generating and looking at on-CPU and off-CPU perf flamegraphs for
small-IO workloads. I never saw "nlock" taking up any time in the
graphs for librbd/librados, but for librbd::AioCompletion and
librados::AioCompletion I did see ~5% of the time being wasted on
string constructors for the lock name. I have a branch where I
switched just librbd::AioCompletion and librados::AioCompletion to use
std::mutex, since the value of lockdep seemed pretty minimal (if not
non-existent) for those two cases.
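
To make the cost concrete, a stand-in for the pattern (simplified
sketch -- this is not the real common/Mutex.h, and the lock name is
made up):

  #include <mutex>
  #include <string>

  // The string member means every construction runs a std::string
  // ctor (and often a heap allocation) just to carry the lock name.
  struct named_mutex {
    std::string name;
    std::mutex m;
    explicit named_mutex(const std::string &n) : name(n) {}
  };

  named_mutex a("AioCompletion::lock");  // pays for the name per completion
  std::mutex b;                          // trivial, no allocation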

> sage



-- 
Jason