Re: boost and valgrind

On Tue, Jul 19, 2022 at 1:49 PM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
>
> the rgw teuthology suite started seeing lots of valgrind issues a
> couple weeks ago. we're tracking them in
> https://tracker.ceph.com/issues/56500
>
> as i understand it, valgrind is complaining about stack memory access
> outside of the current thread's stack:
>
> > <auxwhat>Address 0x57f47f60 is on thread 135's stack</auxwhat>
>
> rgw is using coroutine stacks allocated by boost::context, which
> explains why valgrind is confused. boost::context supports valgrind's
> annotations for these stacks (VALGRIND_STACK_REGISTER), but they
> aren't enabled by default
>
> in March 2020 with https://github.com/ceph/ceph/pull/34043, Adam added
> a cmake option WITH_BOOST_VALGRIND that enables this 'valgrind' option
> for ceph's bundled boost build. in
> https://github.com/ceph/ceph-build/pull/1736, we enabled this for the
> 'notcmalloc' builds that we ran our valgrind tests against
>
> however, we stopped doing 'notcmalloc' builds entirely after
> https://github.com/ceph/teuthology/pull/1618 added the valgrind
> options necessary to run against the normal tcmalloc builds. so we
> lost this fix, but the rgw suite had been getting clean valgrind
> results until just recently
>
> i've confirmed that the issues do go away with WITH_BOOST_VALGRIND
> enabled, but i really don't want to require a special build flavor for
> it
>
> does anyone know what changed here? are valgrind issues popping up
> anywhere else?

this topic was discussed again yesterday in the ceph leadership call.
we've been trying to decide whether to enable WITH_BOOST_VALGRIND
globally, or to add it back as a new build flavor just for teuthology
runs

in https://tracker.ceph.com/issues/56500, Mark Kogan showed that
WITH_BOOST_VALGRIND has little to no effect on rgw performance. Mark
Nelson did similar testing with rbd to rule out a performance hit
there as well. his results had quite a bit of run-to-run variation,
so a small regression was hard to rule out completely

to narrow the scope of this investigation, i grepped the entire
boost project and confirmed that boost::context and boost::coroutine
are the only two libraries that mention the BOOST_USE_VALGRIND flag.
so i think it's safe to assume that these valgrind builds will have no
effect outside of rgw
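
for anyone unfamiliar with the annotation itself, here's a minimal
sketch of the pattern that a flag like BOOST_USE_VALGRIND guards: a
heap-allocated stack is registered with valgrind when it's created and
deregistered when it's freed. annotated_stack is a hypothetical helper
written for this mail, not boost's or ceph's actual code. the only real
apis used are the VALGRIND_STACK_REGISTER/VALGRIND_STACK_DEREGISTER
client requests from valgrind/valgrind.h, and the sketch falls back to
no-ops if that header isn't installed

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// use the real client requests when valgrind's header is available,
// otherwise define no-op stand-ins so the sketch still compiles
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#    define SKETCH_HAVE_VALGRIND_H 1
#  endif
#endif
#ifndef SKETCH_HAVE_VALGRIND_H
#  define VALGRIND_STACK_REGISTER(start, end) ((void)(start), (void)(end), 0u)
#  define VALGRIND_STACK_DEREGISTER(id) ((void)(id))
#endif

// hypothetical helper: owns a malloc'd region intended for use as a
// coroutine stack, and tells valgrind about it for its whole lifetime
class annotated_stack {
  std::size_t size_;
  char* base_;
  unsigned id_;

 public:
  explicit annotated_stack(std::size_t size)
    : size_(size), base_(static_cast<char*>(std::malloc(size))), id_(0) {
    assert(size > 0);
    if (!base_) {
      throw std::bad_alloc();
    }
    // register the [base, base + size) range as a stack, so accesses
    // to it aren't attributed to some other thread's stack
    id_ = VALGRIND_STACK_REGISTER(base_, base_ + size_);
  }

  ~annotated_stack() {
    // deregister before the memory is returned to the allocator
    VALGRIND_STACK_DEREGISTER(id_);
    std::free(base_);
  }

  annotated_stack(const annotated_stack&) = delete;
  annotated_stack& operator=(const annotated_stack&) = delete;

  char* bottom() const { return base_; }        // lowest address
  char* top() const { return base_ + size_; }   // one past the highest
  std::size_t size() const { return size_; }
};
```

outside of valgrind the client requests are designed to be very cheap,
which fits with the rgw performance results above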

i've raised https://github.com/ceph/ceph-build/pull/2043 to enable
WITH_BOOST_VALGRIND on all shaman builds, and will ask the component
leads for approval there
