On Tue, Dec 10, 2019 at 03:21:04PM -0700, Jens Axboe wrote:
> On 12/10/19 3:04 PM, Jann Horn wrote:
> > [context preserved for additional CCs]
> >
> > On Tue, Dec 10, 2019 at 4:57 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
> >> Recently had a regression that turned out to be because
> >> CONFIG_REFCOUNT_FULL was set.
> >
> > I assume "regression" here refers to a performance regression? Do you
> > have more concrete numbers on this? Is one of the refcounting calls
> > particularly problematic compared to the others?
>
> Yes, a performance regression. io_uring is using io-wq now, which does
> an extra get/put on the work item to make it safe against async cancel.
> That get/put translates into a refcount_inc and refcount_dec per work
> item, and meant that we went from 0.5% refcount CPU in the test case to
> 1.5%. That's a pretty substantial increase.
>
> > I really don't like it when raw atomic_t is used for refcounting
> > purposes - not only because that gets rid of the overflow checks, but
> > also because it is less clear semantically.
>
> Not a huge fan either, but... It's hard to give up 1% of extra CPU. You
> could argue I could just turn off REFCOUNT_FULL, and I could. Maybe
> that's what I should do. But I'd prefer to just drop the refcount on the
> io_uring side and keep it on for other potential useful cases.

There is no CONFIG_REFCOUNT_FULL any more. Will Deacon's version came
out as nearly identical to the x86 asm version. Can you share the
workload where you saw this? We really don't want to regress refcount
protections, especially in the face of new APIs.

Will, do you have a moment to dig into this?

-Kees

> >> Our ref count usage is really simple,
> >
> > In my opinion, for a refcount to qualify as "really simple", it must
> > be possible to annotate each relevant struct member and local variable
> > with the (fixed) bias it carries when alive and non-NULL. This
> > refcount is more complicated than that.
>
> :-(
>
> --
> Jens Axboe
>

--
Kees Cook
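
For readers following the thread, here is a rough user-space sketch of the
trade-off being discussed: a raw atomic get/put versus a checked one that
saturates instead of wrapping or resurrecting a freed object. This is not
the kernel's refcount_t code. It uses C11 <stdatomic.h>, and every name in
it (raw_get, raw_put, checked_get, checked_put, REF_SATURATED) is
hypothetical and chosen only for illustration.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical "poisoned" value: far from both 0 and INT_MAX. */
#define REF_SATURATED (INT_MIN / 2)

/* Raw counter: one atomic add, no checks at all. */
static void raw_get(atomic_int *r)
{
    atomic_fetch_add_explicit(r, 1, memory_order_relaxed);
}

/* Returns true when the caller dropped the last reference. */
static bool raw_put(atomic_int *r)
{
    return atomic_fetch_sub_explicit(r, 1, memory_order_acq_rel) == 1;
}

/* Checked counter: same atomic add, plus a test of the old value. */
static void checked_get(atomic_int *r)
{
    int old = atomic_fetch_add_explicit(r, 1, memory_order_relaxed);

    /*
     * old == 0: increment on an already-released object.
     * old < 0 or old == INT_MAX: counter overflowed or is saturated.
     * In all of these cases, pin the counter at the saturated value.
     */
    if (old <= 0 || old == INT_MAX)
        atomic_store_explicit(r, REF_SATURATED, memory_order_relaxed);
}

/* Returns true only for a clean 1 -> 0 transition. */
static bool checked_put(atomic_int *r)
{
    int old = atomic_fetch_sub_explicit(r, 1, memory_order_acq_rel);

    if (old == 1)
        return true;
    /* Underflow or already saturated: stay saturated, never "free". */
    if (old <= 0)
        atomic_store_explicit(r, REF_SATURATED, memory_order_relaxed);
    return false;
}
```

The checked variants add a compare and a rarely taken branch on top of the
same atomic operation, which is the kind of per-work-item overhead the
get/put pair above would pay twice. Whether that accounts for the measured
1% depends on the workload, which this sketch does not model.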