Re: operate one file in multi clients with libceph

Hi Sage,

    I've tested this patch.
Across 8 test runs, only 1 (the first) hit a thread assert failure, as
back trace 1 shows:
==========================  back trace 1   ==================================
(gdb) bt
#0  0x000000367fa30265 in raise () from /lib64/libc.so.6
#1  0x000000367fa31d10 in abort () from /lib64/libc.so.6
#2  0x0000003682ebec44 in __gnu_cxx::__verbose_terminate_handler() ()
from /usr/lib64/libstdc++.so.6
#3  0x0000003682ebcdb6 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x0000003682ebcde3 in std::terminate() () from /usr/lib64/libstdc++.so.6
#5  0x0000003682ebceca in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00007ffff7c51a54 in ceph::__ceph_assert_fail
(assertion=0x7ffff7ce31f1 "in->cap_refs[(32 << 8)] == 1",
    file=0x7ffff7ce1431 "client/Client.cc", line=2190,
func=0x7ffff7ce5fc0 "void Client::_flush(Inode*, Context*)")
    at common/assert.cc:86
#7  0x00007ffff7b39bb0 in Client::_flush (this=0x629090, in=0x630f50,
onfinish=0x7fffe8000b60) at client/Client.cc:2190
#8  0x00007ffff7b52426 in Client::handle_cap_grant (this=0x629090,
in=0x630f50, mds=0, cap=0x62ee50, m=0x631fc0)
    at client/Client.cc:2930
#9  0x00007ffff7b52d2a in Client::handle_caps (this=0x629090,
m=0x631fc0) at client/Client.cc:2711
#10 0x00007ffff7b560a0 in Client::ms_dispatch (this=0x629090,
m=0x631fc0) at client/Client.cc:1444
#11 0x00007ffff7bcaff9 in Messenger::ms_deliver_dispatch
(this=0x628350, m=0x631fc0) at msg/Messenger.h:98
#12 0x00007ffff7bb2607 in SimpleMessenger::dispatch_entry
(this=0x628350) at msg/SimpleMessenger.cc:352
#13 0x00007ffff7b22641 in SimpleMessenger::DispatchThread::entry
(this=0x6287d8) at msg/SimpleMessenger.h:533
#14 0x00007ffff7b5aa04 in Thread::_entry_func (arg=0x6287d8) at
./common/Thread.h:41
#15 0x00000036802064a7 in start_thread () from /lib64/libpthread.so.0
#16 0x000000367fad3c2d in clone () from /lib64/libc.so.6
============================================================
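For context, the assertion that fires is in->cap_refs[(32 << 8)] == 1, i.e.
the inode's buffer-cap reference count ((32 << 8) is CEPH_CAP_FILE_BUFFER,
if I read the cap bits right). With client_lock no longer taken in
sync_write_commit(), another thread can touch the inode's cap state while
the dispatch thread is inside _flush(). A minimal hypothetical sketch of
that kind of race (illustration only, not Ceph code; the names are
borrowed):

// Hypothetical race sketch, not Ceph code: two threads mutate a shared
// per-inode refcount map without holding the client lock, so an invariant
// check like "cap_refs[CEPH_CAP_FILE_BUFFER] == 1" elsewhere can observe
// a transient value and abort.
#include <cassert>
#include <map>
#include <thread>

static const int CEPH_CAP_FILE_BUFFER = 32 << 8;
std::map<int, int> cap_refs = {{CEPH_CAP_FILE_BUFFER, 1}};

void writer_thread() {
  // In the real client this would be get/put of a cap reference from the
  // write path; here it is just an unprotected increment and decrement.
  ++cap_refs[CEPH_CAP_FILE_BUFFER];
  --cap_refs[CEPH_CAP_FILE_BUFFER];
}

int main() {
  std::thread t(writer_thread);
  // The dispatch thread (handle_cap_grant -> _flush) asserts the count is
  // exactly 1; with no lock this can race and fire, as in back trace 1.
  assert(cap_refs[CEPH_CAP_FILE_BUFFER] == 1);
  t.join();
}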

In addition, when writing data for a long time, the write thread will
hang on a pthread_cond_wait, as back trace 2 shows:
==========================  back trace 2  ==================================
Thread 10 (Thread 0x43f87940 (LWP 19986)):
#0  0x000000312be0ab99 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x00007f147a8e763b in Cond::Wait (this=0x43f86b00,
mutex=@0x66d790) at ./common/Cond.h:46
#2  0x00007f147a86630d in Client::wait_on_list (this=0x66d430,
ls=@0x674df0) at client/Client.cc:2140
#3  0x00007f147a8771ff in Client::get_caps (this=0x66d430,
in=0x6749d0, need=4096, want=8192, got=0x43f86e8c, endoff=235470848)
    at client/Client.cc:1827
#4  0x00007f147a886003 in Client::_write (this=0x66d430, f=0x671510,
offset=235470336, size=512,
    buf=0x6452b0 'ï' <repeats 33 times>, ".001", 'ï' <repeats 165
times>...) at client/Client.cc:5055
#5  0x00007f147a886d73 in Client::write (this=0x66d430, fd=10,
    buf=0x6452b0 'ï' <repeats 33 times>, ".001", 'ï' <repeats 165
times>..., size=512, offset=235470336) at client/Client.cc:5007
#6  0x00007f147a85822d in ceph_write (fd=10, buf=0x6452b0 'ï' <repeats
33 times>, ".001", 'ï' <repeats 165 times>..., size=512,
    offset=235470336) at libceph.cc:322
#7  0x000000000042fa27 in TdcCephImpl::AsyncWrite (this=0x647c30,
offset=235470336, length=512,
    buf=0x6452b0 'ï' <repeats 33 times>, ".001", 'ï' <repeats 165
times>..., queue=@0x644b60, io=0x6456a0)
    at /opt/tsk/tdc-tapdisk/td-connector2.0/common/tdc_ceph_impl.cpp:191
#8  0x000000000042ffc5 in TdcCephImpl::AsyncProcess (this=0x647c30,
io=0x6456a0, queue=@0x644b60, netfd=5)
    at /opt/tsk/tdc-tapdisk/td-connector2.0/common/tdc_ceph_impl.cpp:268
#9  0x00000000004192be in SubmitRequest_Intra (arg=0x644b00) at
/opt/tsk/tdc-tapdisk/td-connector2.0/td_connector_server.cpp:390
#10 0x000000312be064a7 in start_thread () from /lib64/libpthread.so.0
#11 0x000000312b6d3c2d in clone () from /lib64/libc.so.6
============================================================
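The hang itself is the writer parked in Client::wait_on_list(): get_caps()
puts the thread on a wait list and blocks on the Cond until a cap grant
signals it, so if the grant never arrives (or the dispatch thread has died,
as in back trace 1), the wait never returns. A simplified sketch of that
wait pattern, using standard C++ primitives rather than the actual
Cond/Mutex wrappers:

// Simplified sketch of the wait pattern in frames #0-#2 of back trace 2,
// using std:: primitives instead of Ceph's Cond/Mutex wrappers.
#include <condition_variable>
#include <mutex>

std::mutex client_lock;            // stands in for Client::client_lock
std::condition_variable cond;
bool caps_granted = false;

void get_caps_sketch() {           // runs in the writer thread
  std::unique_lock<std::mutex> l(client_lock);
  // Equivalent of wait_on_list(): sleep until someone signals the list.
  // If no cap grant message ever arrives, this blocks forever.
  cond.wait(l, [] { return caps_granted; });
}

void handle_cap_grant_sketch() {   // runs in the messenger dispatch thread
  std::lock_guard<std::mutex> l(client_lock);
  caps_granted = true;
  cond.notify_all();               // wakes the writer out of get_caps()
}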

The patch doesn't seem entirely safe.

So I applied this patch instead:
diff --git a/src/client/Client.cc b/src/client/Client.cc
index 7f7fb08..bf0997a 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -4934,7 +4934,7 @@ public:

 void Client::sync_write_commit(Inode *in)
 {
-  client_lock.Lock();
+  int r = client_lock.Lock();
   assert(unsafe_sync_write > 0);
   unsafe_sync_write--;

@@ -4947,8 +4947,9 @@ void Client::sync_write_commit(Inode *in)
   }

   put_inode(in);

+  if (r)
   client_lock.Unlock();
 }

 int Client::write(int fd, const char *buf, loff_t size, loff_t offset)


After some more testing, back trace 1 no longer appeared, but back trace 2
still did. Hmm, still not safe.
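For what it's worth, a "lock only if not already held" scheme like the one
above is only well-defined with an error-checking mutex, and the unlock has
to pair with the case where the lock call actually succeeded. A hypothetical
sketch in plain pthreads (not Ceph's Mutex wrapper):

// Hypothetical sketch with a PTHREAD_MUTEX_ERRORCHECK mutex, in plain
// pthreads rather than Ceph's Mutex wrapper. pthread_mutex_lock() returns
// EDEADLK when the calling thread already owns the mutex, so we can tell
// whether this call actually acquired it and only unlock in that case.
#include <cassert>
#include <cerrno>
#include <pthread.h>

pthread_mutex_t client_lock;

void init_client_lock() {
  pthread_mutexattr_t attr;
  pthread_mutexattr_init(&attr);
  pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
  pthread_mutex_init(&client_lock, &attr);
  pthread_mutexattr_destroy(&attr);
}

void sync_write_commit_sketch() {
  int r = pthread_mutex_lock(&client_lock);
  assert(r == 0 || r == EDEADLK);        // EDEADLK: we already held it
  bool acquired_here = (r == 0);

  /* ... unsafe_sync_write--, signal waiters, put_inode(in) ... */

  if (acquired_here)
    pthread_mutex_unlock(&client_lock);  // release only what this call took
}

Even then, conditional locking like this mostly papers over an unclear
caller contract; the cleaner fix is for sync_write_commit() to be entered
with one well-defined locking context.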

Thx!
Simon



2011/5/19 Sage Weil <sage@xxxxxxxxxxxx>:
> Hi Simon,
>
> On Wed, 18 May 2011, Simon Tian wrote:
>
>> Hi,
>>
>>     Could anyone give me an answer?
>>
>>     My application needs to open one file with two clients on different
>> hosts: one for write/read and one for read-only. The client could be
>> developed based on libceph or librbd.
>>     I tried librbd; an exception appeared too.
>
> I opened a bug for this, http://tracker.newdream.net/issues/1097.
>
> Can you try the patch below?  I think this may just be something missed in a
> locking rewrite way back when, in a rare code path:
>
> Thanks!
> sage
>
>
>
> diff --git a/src/client/Client.cc b/src/client/Client.cc
> index 7f7fb08..bf0997a 100644
> --- a/src/client/Client.cc
> +++ b/src/client/Client.cc
> @@ -4934,7 +4934,6 @@ public:
>
>  void Client::sync_write_commit(Inode *in)
>  {
> -  client_lock.Lock();
>    assert(unsafe_sync_write > 0);
>    unsafe_sync_write--;
>
> @@ -4947,8 +4946,6 @@ void Client::sync_write_commit(Inode *in)
>    }
>
>    put_inode(in);
> -
> -  client_lock.Unlock();
>  }
>
>  int Client::write(int fd, const char *buf, loff_t size, loff_t offset)
>
>

