I've been puzzling over a teuthology failure in which lockdep was
enabled and reported a lock cycle. The output for the reported cycle is
below. I think the report is a false positive: the two dependencies
that form the cycle are established in separate threads. Lockdep is
correctly detecting a possible deadlock due to out-of-order locking
(thread1: a -> b, thread2: b -> a), but in this case I don't think the
deadlock can actually happen, because the two threads never run at the
same time.
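
To make the out-of-order case concrete, here is a minimal sketch of an
order-tracking graph that flags a cycle as soon as the reverse ordering
shows up. It is not Ceph's lockdep code; OrderGraph and will_lock are
hypothetical names used only for illustration, and the lock ids are the
ones from the trace (10 = Client::client_lock, 4 = SimpleMessenger::lock).
Like the behaviour described above, it ignores which thread took the locks.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical, simplified lock-order tracker; not Ceph's lockdep code.
class OrderGraph {
  // follows_[a] contains b if lock b was ever taken while a was held (a -> b).
  std::map<int, std::set<int>> follows_;

  // Depth-first search: is `to` reachable from `from` along recorded edges?
  bool reachable(int from, int to, std::set<int>& seen) const {
    if (from == to) return true;
    if (!seen.insert(from).second) return false;  // already visited
    auto it = follows_.find(from);
    if (it == follows_.end()) return false;
    for (int next : it->second)
      if (reachable(next, to, seen)) return true;
    return false;
  }

public:
  // Record that `acquiring` is being taken while every lock in `held` is held.
  // Returns true if any new edge held -> acquiring closes a cycle.
  bool will_lock(const std::vector<int>& held, int acquiring) {
    assert(acquiring >= 0);  // lock ids are non-negative in this sketch
    bool cycle = false;
    for (int h : held) {
      std::set<int> seen;
      // A path acquiring -> ... -> h already exists, so h -> acquiring loops.
      if (reachable(acquiring, h, seen)) cycle = true;
      follows_[h].insert(acquiring);
    }
    return cycle;
  }
};
```

Calling will_lock({10}, 4) records 10 -> 4 without complaint; a later
will_lock({4}, 10) returns true no matter which thread makes the call,
which is the situation in the trace below.
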
My proposed fixes are in wip-lockdep-fixes. They resolve the issue of
stomping on thread ids by using gettid() instead of pthread_self(), and
only report a cycle when both dependencies occur within the same
thread. They also allow the g_lockdep field to be set to 3, which will
detect (but only warn about) possible deadlocks across threads.
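
Roughly, the idea is to remember which thread established each
dependency and to treat a reversed ordering as a hard cycle only when
the same thread produced both orderings. The sketch below is an
illustration under assumptions, not the code in the branch:
PerThreadOrder, Verdict and will_lock are hypothetical names, the
thread id is passed in as a plain number (in the real code it would
come from gettid()), and only direct a -> b / b -> a reversals are
checked, not longer chains.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>

// Hypothetical result type: what the checker would report for an acquisition.
enum class Verdict { Ok, WarnCrossThread, SameThreadCycle };

// Hypothetical per-thread lock-order tracker; not the wip-lockdep-fixes code.
class PerThreadOrder {
  // Key (a, b) means "b was taken while a was held"; the value is the set of
  // thread ids that have established that ordering.
  std::map<std::pair<int, int>, std::set<long>> edges_;

public:
  // `tid` identifies the calling thread (assumed to be a gettid()-style id).
  Verdict will_lock(long tid, int held, int acquiring) {
    assert(held != acquiring);  // recursive locking is out of scope here
    Verdict v = Verdict::Ok;
    auto rev = edges_.find(std::make_pair(acquiring, held));
    if (rev != edges_.end()) {
      // The reverse ordering exists. It is a hard cycle only if this same
      // thread created it; otherwise it is just a cross-thread warning.
      v = rev->second.count(tid) ? Verdict::SameThreadCycle
                                 : Verdict::WarnCrossThread;
    }
    edges_[std::make_pair(held, acquiring)].insert(tid);
    return v;
  }
};
```

With the ids from the trace, thread 1 taking 10 -> 4 followed by
thread 2 taking 4 -> 10 yields WarnCrossThread (the case that would
only be reported with g_lockdep set to 3), while thread 1 itself taking
4 -> 10 afterwards yields SameThreadCycle.
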
-sam
------------------------------------
existing dependency Client::client_lock (10) -> SimpleMessenger::lock (4) at:
ceph version 0.55-217-g331c250 (331c25046ecd99ec10c5835e8e674ca819e6168a)
1: (Client::init()+0xbbd) [0x7f0303a9d83d]
2: (ceph_mount_info::mount(std::string const&)+0x191) [0x7f0303a772b1]
3: (ceph_mount()+0x76) [0x7f0303a75bc6]
4: (LibCephFS_Open_empty_component_Test::TestBody()+0x4e1) [0x432cc1]
5: (testing::Test::Run()+0xaa) [0x46089a]
6: (testing::internal::TestInfoImpl::Run()+0x100) [0x4609a0]
7: (testing::TestCase::Run()+0xbd) [0x460a6d]
8: (testing::internal::UnitTestImpl::RunAllTests()+0x217) [0x460cd7]
9: (main()+0x35) [0x41c4d5]
10: (__libc_start_main()+0xed) [0x7f030309176d]
11: test_libcephfs() [0x41c531]
2012-12-10 19:31:30.231305 7f02c8ff9700 0 new dependency SimpleMessenger::lock (4) -> Client::client_lock (10) creates a cycle at
ceph version 0.55-217-g331c250 (331c25046ecd99ec10c5835e8e674ca819e6168a)
1: (ObjectCacher::FlusherThread::entry()+0x15) [0x7f0303d98005]
2: (Thread::_entry_func(void*)+0x12) [0x7f0303c908d2]
3: (()+0x7e9a) [0x7f0304a32e9a]
4: (clone()+0x6d) [0x7f03031624bd]
2012-12-10 19:31:30.231332 7f02c8ff9700 0 btw, i am holding these locks:
2012-12-10 19:31:30.231334 7f02c8ff9700 0 SimpleMessenger::lock (4)
2012-12-10 19:31:30.231335 7f02c8ff9700 0