Has anyone tested the commits before and after the merge of the protocol
refactor and confirmed that that is in fact the culprit?

The Mutex -> ceph::mutex changes that have been trickling in might also be
responsible, as those may have accidentally changed some Mutex(.., false
for no lockdep) to ceph::mutex (always lockdep), which would also affect
only the debug builds.

sage

On Thu, 25 Oct 2018, Yan, Zheng wrote:
> On Thu, Oct 25, 2018 at 7:17 AM Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
> >
> > To clarify, what is the result with "-g -O2"?
> >
>
> there is no slowdown in RelWithDebInfo mode
>
> >
> > Matt
> >
> > On Wed, Oct 24, 2018 at 5:59 PM, Ricardo Dias <rdias@xxxxxxxx> wrote:
> > >
> > >
> > > On 24/10/2018 21:54, Gregory Farnum wrote:
> > >> Do we understand why debug mode got so much slower? Is there something
> > >> we can do to improve it?
> > >
> > > I believe the slowdown comes from the increase in the number of
> > > functions used in the new implementation. While the previous
> > > implementation built the state machine from just two big functions
> > > (each with a switch/case block), the new implementation uses one
> > > function per protocol state.
> > > I'm not familiar with what the compiler generates in Debug mode, but I
> > > imagine there are now many more debug symbols to track, and fewer
> > > optimizations the compiler can perform without confusing the debugger
> > > tools.
> > >
> > > I currently don't see a way to improve the performance in Debug mode.
> > > One thing we can do, though, is also check the performance when
> > > compiling in RelWithDebInfo mode. If it performs similarly to
> > > Release mode, at least we still have debug symbols to help in
> > > identifying some problems.
> > >
> > >>
> > >> We are for instance seeing new issues with the messenger in our
> > >> testing, apparently because the reduced speed opens up race conditions
> > >> much wider.
> > >> In this case that's good for us, but it could easily go
> > >> the other way as well, and I'm concerned about not finding new issues
> > >> in our testing if the difference is so substantial compared to what
> > >> will be deployed by users.
> > >
> > > Maybe we can build packages for the binaries compiled with the two modes
> > > (Debug and Release) and be able to specify which one to use in each test
> > > run.
> > >
> > >> -Greg
> > >> On Wed, Oct 24, 2018 at 3:18 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> > >>>
> > >>> Only ceph compiled in debug mode has the regression. Ceph compiled in
> > >>> release mode has no regression. Sorry for the noise.
> > >>>
> > >>> Yan, Zheng
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Oct 24, 2018 at 1:46 PM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> > >>>>
> > >>>> Hi,
> > >>>>
> > >>>> Yesterday I checked how fast ceph-mds can process requests (a client
> > >>>> keeps sending getattr requests for the root inode). The request rate
> > >>>> I got is only about half of what the same test gave a few weeks ago.
> > >>>> A perf profile of ceph-mds shows that messenger functions used more
> > >>>> CPU time compared to the mimic code. The performance results and
> > >>>> perf profiles are at http://tracker.ceph.com/issues/36561.
> > >>>>
> > >>>> Regards
> > >>>> Yan, Zheng
> > >>
> > >
> > > --
> > > Ricardo Dias
> > > Senior Software Engineer - Storage Team
> > > SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> > > HRB 21284 (AG Nürnberg)
> > >
> >
> >
> > --
> >
> > Matt Benjamin
> > Red Hat, Inc.
> > 315 West Huron Street, Suite 140A
> > Ann Arbor, Michigan 48103
> >
> > http://www.redhat.com/en/technologies/storage
> >
> > tel. 734-821-5101
> > fax. 734-769-8938
> > cel. 734-216-5309
> >