Re: ceph_assert() vs ceph_assert_always()

Gregory Farnum <gfarnum@xxxxxxxxxx> · Tue, 21 Nov 2023 08:46:11 -0800

I think if we start doing this, we will need to run significant
portions of our testing with ceph_assert enabled and disabled. Perhaps
we can shift to automatically-scheduled and release runs using the
release builds, and pre-merge integration tests running with asserts
enabled?

But I think we'd also need some kind of proactive check for side
effects before doing a switchover. Hopefully there aren't a lot, but
there are going to be some.
-Greg

On Tue, Nov 21, 2023 at 7:41 AM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
>
> we might start by picking some cpu-bound workload to measure the
> difference with ceph_asserts compiled out. encode/decode would
> probably be a good target, given all the assertions in bufferlist
> itself. that could give us some data to motivate further efforts here
>
> On Fri, Nov 17, 2023 at 2:00 PM Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
> >
> > This might seem in the weeds for a lot of folks, but the density of (currently always-expanded) ceph_assert() calls in fast paths seems really significant, to me.
> >
> > Matt
> >
> > On Mon, Oct 2, 2023 at 3:01 PM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
> >>
> >> Matt recently raised the issue of ceph assertions in production code,
> >> which reminded me of Sage's 2016 pr
> >> https://github.com/ceph/ceph/pull/9969 that added a
> >> ceph_assert_always(). the idea was to eventually make ceph_assert()
> >> conditional on NDEBUG to match the behavior of libc's assert(),
> >> leaving ceph_assert_always() as the marked unconditional case
> >>
> >> i would love to see this finally happen, but there are some potential risks:
> >>
> >> * ceph_assert()s with side effects won't behave as expected in release
> >> builds. assert() documents this same issue at
> >> https://www.man7.org/linux/man-pages/man3/assert.3.html#BUGS. if we
> >> could at least identify these cases, we can switch them to
> >> ceph_assert_always()
> >>
> >> * in teuthology, we test the same release builds that we ship to
> >> users. that means teuthology won't catch the code paths that trigger
> >> debug assertions. if those lead to crashes, they could be much harder
> >> to debug without the assertions and backtraces
> >>
> >> * conversely, merging pull requests after a successful teuthology run
> >> may introduce new assertions in debug builds. it could be annoying for
> >> developers to track down and fix new assertions after pulling the
> >> latest main or stable release branch
> >>
> >> * unused variable warnings in release builds where ceph_assert() was
> >> the only reference. at least the compiler will catch all of these for
> >> us, and [[maybe_unused]] annotations can clear them up
> >>
> >>
> >> in general, do folks agree that this is a change worth making? if so,
> >> what can we do to mitigate the risks?
> >>
> >> if not, how should we handle the use of ceph_assert() vs raw assert()
> >> in new code? should there be some guidance in CodingStyle?
> >>
> >> as a half-measure, we might introduce a new ceph_assert_debug() as an
> >> alternative to raw assert(), then convert some existing uses of
> >> ceph_assert() on a case-by-case basis
> >>
> >
> >
> > --
> >
> > Matt Benjamin
> > Red Hat, Inc.
> > 315 West Huron Street, Suite 140A
> > Ann Arbor, Michigan 48103
> >
> > http://www.redhat.com/en/technologies/storage
> >
> > tel.  734-821-5101
> > fax.  734-769-8938
> > cel.  734-216-5309
> _______________________________________________
> Dev mailing list -- dev@xxxxxxx
> To unsubscribe send an email to dev-leave@xxxxxxx
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx