On Mon, Oct 25, 2021 at 5:18 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote: > > Right, sure, but it's not a rare pattern. Well, for an actual reference count it certainly isn't a rare pattern, and zero _is_ special, because at zero, you are now in use-after-free territory. But that's kind of the issue here: that really isn't what 'secretmem_users' was ever about. Zero isn't some "now we're use-after-free" situation. Quite the reverse. Zero ends up being the safe thing. So with that kind of "just count number of existing users", where zero isn't special, then refcount_t doesn't make sense. And refcount_t is for non-core stuff that has a lot of random kernel users that you can't easily verify. In contrast, 'secretmem_users' had exactly two sites that modified it, and one that tested it. > But these places need to check for insane > conditions too ("we got a -1 back -- this means there's a bug but what > do we do?"). Same for atomic_inc(): "oh, we're at our limit, do > something", but what above discovering ourselves above the limit? So honestly, "above the limit" is often perfectly fine too. It can be fine for two very different reasons: (a) racy checks are often much simpler and faster, and perfectly safe when the limit is "far away from overflow". (b) limits can change And (a) isn't just about "avoid special atomics". It's about doing the limit check optimistically outside locking etc. And (b) wasn't an issue here (where the only real use was ltierally "are there any users at all"), but in most _proper_ use cases you will want to have some resource limit that might be set by MIS. And might be changed dynamically. So it's entirely possible that somebody sets the limit to something smaller than the current user (prep for shutdown, whatever), without it being an error at all. The limit is for future work, not for past work. Easily happens with things like rlimits etc. > There's nothing about using the atomic_t primitives that enforces these > kinds of checks. (And there likely shouldn't be for atomic_t -- it's a > plain type.) But we likely need something that fills in this API gap > between atomic_t and refcount_t. I dispute the "need". This isn't as common as you claim. Most resource counting _is_ for "free when no longer used". And on the other end, you have the users that don't want refcount_t because they can't live with the limitations of that interface, like the page counts etc, that do it properly. So I think in 99% of all situations, the proper fix is to embed an "atomic_t" in the type it protects, and then have the helper functions that actually do it properly. Like we do for "get_page()" and friends. The "new type" isn't about the reference counting, it's about the data itself, and the atomic_t is just a part of it. Could we do something new type that warns on the "decrement past zero" and "overflow on increment"? Sure. But since they by _definition_ aren't about data lifetimes, they probably don't need saturation - you want the _warning_, but they aren't protecting data, since they aren't refcounts. Or could we have something even fancier, that is an actual defined range, and "overflow" is not "overflow in 32 bits", but "becomes bigger than X")? That gets more complex because now you'd have to encode the range in the type somehow. You could do it with actual static types (generate typedef names and code), or you could do it with types that have a more dynamic pointer to ranges (kind of like the sysfs interfaces do) or have the ranges embedded in the data structure itself. But honestly, the complexity downside seems to just dwarf the upside. Linus