RE: ceph issue

Yup, same good status here. Thanks for the fix.
I also recommend merging to master.

On a side note, executing "fio --blocksize=10M" brings my cluster to HEALTH_WARN with "8 requests are blocked > 32 sec". The cluster recovers from this situation only after I kill the "bad" fio process.

Avner

> -----Original Message-----
> From: Marov Aleksey [mailto:Marov.A@xxxxxxxxxx]
> Sent: Monday, November 21, 2016 18:20
> To: Haomai Wang <haomai@xxxxxxxx>; Avner Ben Hanoch
> <avnerb@xxxxxxxxxxxx>
> Cc: Sage Weil <sweil@xxxxxxxxxx>; ceph-devel@xxxxxxxxxxxxxxx
> Subject: HA: ceph issue
> 
> It seems to me that your last patch fixed the problem. It works fine with fio
> 2.13 and fio 2.15. I think it can be merged into master.
> 
> Thanks a lot for your work. I'll do some performance tests next.
> 
> Best Regards
> Alex Marov
> ________________________________________
> 
> 
> @Avner, please try again; I submitted a new patch to fix leaks.
> 
> On Sun, Nov 20, 2016 at 10:29 PM, Avner Ben Hanoch
> <avnerb@xxxxxxxxxxxx> wrote:
> > Perhaps a similar fix is needed in additional places.
> > See my stack trace below (it failed on the same assert(sub < m_subsys.size())).
> >
> > --
> > #0  0x00007fffe55525f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> > #1  0x00007fffe5553ce8 in __GI_abort () at abort.c:90
> > #2  0x00007fffe6dbbd47 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x7fffe70599d8 "sub < m_subsys.size()",
> >     file=file@entry=0x7fffe7059688 "/mnt/data/avnerb/rpmbuild/BUILD/ceph-11.0.2-1611-geb25965/src/log/SubsystemMap.h", line=line@entry=62,
> >     func=func@entry=0x7fffe7074040 <_ZZN4ceph7logging12SubsystemMap13should_gatherEjiE19__PRETTY_FUNCTION__> "bool ceph::logging::SubsystemMap::should_gather(unsigned int, int)")
> >     at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/common/assert.cc:78
> > #3  0x00007fffe6cd215a in ceph::logging::SubsystemMap::should_gather (level=10, sub=27, this=<optimized out>) at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/log/SubsystemMap.h:62
> > #4  0x00007fffe6e65865 in should_gather (level=10, sub=27, this=<optimized out>) at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/msg/async/net_handler.cc:180
> > #5  ceph::NetHandler::generic_connect (this=0x86dc18, addr=..., nonblock=nonblock@entry=false) at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/msg/async/net_handler.cc:174
> > #6  0x00007fffe6e65b17 in ceph::NetHandler::connect (this=<optimized out>, addr=...) at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/msg/async/net_handler.cc:198
> > #7  0x00007fffe700105c in RDMAConnectedSocketImpl::try_connect (this=this@entry=0x7fffbc000ef0, peer_addr=..., opts=...) at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/msg/async/rdma/RDMAConnectedSocketImpl.cc:111
> > #8  0x00007fffe6e68ed4 in RDMAWorker::connect (this=0x7fffa806e650, addr=..., opts=..., socket=0x7fffa00235b0) at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/msg/async/rdma/RDMAStack.cc:48
> > #9  0x00007fffe6fee873 in AsyncConnection::_process_connection (this=this@entry=0x7fffa0023450) at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/msg/async/AsyncConnection.cc:864
> > #10 0x00007fffe6ff5148 in AsyncConnection::process (this=0x7fffa0023450) at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/msg/async/AsyncConnection.cc:812
> > #11 0x00007fffe6e5d6ac in EventCenter::process_events (this=this@entry=0x7fffa806e6d0, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000)
> >     at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/msg/async/Event.cc:430
> > #12 0x00007fffe6e5fbba in NetworkStack::__lambda1::operator() (__closure=0x7fffa80f5630) at /usr/src/debug/ceph-11.0.2-1611-geb25965/src/msg/async/Stack.cc:47
> > #13 0x00007fffe3e71220 in std::(anonymous namespace)::execute_native_thread_routine (__p=<optimized out>) at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
> > #14 0x00007fffe5ae9dc5 in start_thread (arg=0x7fffcbb93700) at pthread_create.c:308
> > #15 0x00007fffe561321d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
> >
> >
> >> -----Original Message-----
> >> From: Avner Ben Hanoch
> >> Sent: Sunday, November 20, 2016 15:22
> >> To: 'Haomai Wang' <haomai@xxxxxxxx>; Marov Aleksey
> >> <Marov.A@xxxxxxxxxx>
> >> Cc: Sage Weil <sweil@xxxxxxxxxx>; ceph-devel@xxxxxxxxxxxxxxx
> >> Subject: RE: ceph issue
> >>
> >> This PR doesn't have any effect on the assertion. I still get it in the same
> >> situation.
> >>
> >> ---
> >> $ ./fio --ioengine=rbd --invalidate=0 --rw=write --bs=10M --numjobs=1 --clientname=admin --pool=rbd --iodepth=128 --rbdname=img2g --name=1
> >> 1: (g=0): rw=write, bs=10M-10M/10M-10M/10M-10M, ioengine=rbd, iodepth=128
> >> fio-2.13-91-gb678
> >> Starting 1 process
> >> rbd engine: RBD version: 0.1.11
> >> /mnt/data/avnerb/rpmbuild/BUILD/ceph-11.0.2-1611-geb25965/src/log/SubsystemMap.h: In function 'bool ceph::logging::SubsystemMap::should_gather(unsigned int, int)' thread 7f7c7b3a5700 time 2016-11-20 13:17:56.090289
> >> /mnt/data/avnerb/rpmbuild/BUILD/ceph-11.0.2-1611-geb25965/src/log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
> >> ceph version 11.0.2-1611-geb25965 (eb25965b74aa1a0379d091169d80786f30c72a8b)
> >> ---
> >>
> >> > -----Original Message-----
> >> > From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Haomai Wang
> >> > Subject: Re: ceph issue
> >> >
> >> > Sorry, I see the issue now. I submitted a
> >> > PR (https://github.com/ceph/ceph/pull/12068). Please test with this.