On Mon, Oct 25, 2021 at 3:30 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> > A refcount being zero means that the data it referenced no longer exists.
>
> I don't disagree with this definition, but I would like to understand how
> some other use-cases fit into this.

I certainly hope that there are no other use-cases for 'refcount_t',
because that "zero is invalid" is very much part of the semantics.

If we want other semantics, it should be a new type.

> What about the case of what
> I see that is more like a "shared resource usage count" where the shared
> resource doesn't necessarily disappear when we reach "no users"?

So I think that's really "atomic_t".

And instead of saturating, people should always check such shared
resources for limits.

> i.e. there is some resource, and it starts its life with no one using it
> (count = 1).

You are already going off into the weeds.

That's not a natural thing to do. It's already confusing. Really. Read
that sentence yourself, and read it like an outsider.

"No one is using it, so count == 1" is a nonsensical statement on the
face of it.

You are thinking of a refcount_t trick, not some sane semantics.

Yes, we have played off-by-one games in the kernel before. We've done
it for various subtle reasons.

For example, traditionally, on x86, with atomic counting there are
three special situations: negative, 0 and positive. So if you use the
traditional x86 counting atomics (just add/sub/inc/dec, no xadd) then
there are situations where you can get more information about the
result in %eflags if you don't use zero as the initial value, but -1.

Because then you can do "inc", and if ZF is set, you know you were the
_first_ person to increment it. And when you use "dec", and SF is set
afterwards, you know you are the _last_ person to decrement it.

That was useful when things like "xadd" weren't available, and cmpxchg
loops are expensive.

So we used to have counters where -1 was that "zero point".
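
To make the "-1 is the zero point" trick concrete, here is a minimal
userspace sketch in C11. It is not kernel code, and every name in it
(biased_count, biased_count_init, biased_get_is_first,
biased_put_is_last) is made up for illustration. Portable C cannot look
at %eflags, so the sketch assumes a fetch-and-add (which is effectively
"xadd") and recomputes what ZF after "inc" and SF after "dec" would
have reported.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type: a use count biased so that -1 means "no users". */
typedef struct {
    atomic_int biased;
} biased_count;

/* Start at the biased "zero point" of -1. */
static void biased_count_init(biased_count *c)
{
    atomic_init(&c->biased, -1);
}

/*
 * Take a use. Returns true if the caller is the first user.
 * Mirrors "lock inc" followed by a ZF test: the new value is 0
 * only when the old value was the biased zero point (-1).
 */
static bool biased_get_is_first(biased_count *c)
{
    int new_value = atomic_fetch_add(&c->biased, 1) + 1;
    return new_value == 0;
}

/*
 * Drop a use. Returns true if the caller was the last user.
 * Mirrors "lock dec" followed by an SF test: the new value is
 * negative only when the count falls back to -1.
 */
static bool biased_put_is_last(biased_count *c)
{
    int new_value = atomic_fetch_sub(&c->biased, 1) - 1;
    return new_value < 0;
}
```

Note how a reader has to carry the bias around in their head to make
sense of any raw value of the counter, which is the readability cost
being described here.
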
Very similar to your "1 is the zero point".

But was it _logical_? No. It was an implementation trick. I think
we've removed all those cases because it was so subtle and confusing
(but maybe we still have it somewhere - I did not check).

So we've certainly played those kinds of games. But it had better be
for a really good reason.

> I don't see as clear a distinction between secretmem and the above
> examples.

I really don't see what's wrong with 'atomic_t', and just checking for
limits.

Saturating counters are EVIL AND BAD. They are a DoS waiting to
happen. Once you saturate, the machine is basically dead. You may have
"protected" against some attack, but you did so by killing the machine
and making the resource accounting no longer work.

So if a user can ever trigger a saturating counter, that's a big big
problem in itself.

In contrast, an 'atomic_t' with a simple limit? It just works. And it
doesn't need illogical tricks to work.

Stop thinking that refcount_t is a good type. Start realizing the
downsides. Start understanding that saturation is a HORRENDOUSLY BAD
solution, and horrible QoI.

              Linus
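
For illustration, the "atomic counter with a simple limit" idea above
might look roughly like this in userspace C11. This is only a sketch
under assumptions: the kernel's atomic_t API is not used here, and the
names usage_count, usage_init, usage_try_get, usage_put and usage_read
are hypothetical. Zero really means "no users", and hitting the limit
is reported to the caller as a failure instead of saturating.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical usage counter: 0 means "no users", with no bias. */
typedef struct {
    atomic_int users;
    int limit;          /* hypothetical cap on concurrent users */
} usage_count;

static void usage_init(usage_count *u, int limit)
{
    atomic_init(&u->users, 0);
    u->limit = limit;
}

/*
 * Try to take a use. At the limit this fails and leaves the count
 * unchanged, so the accounting stays exact and the caller can
 * return an error to whoever asked.
 */
static bool usage_try_get(usage_count *u)
{
    int old = atomic_load(&u->users);

    do {
        if (old >= u->limit)
            return false;
        /* On failure the CAS reloads 'old' and we re-check the limit. */
    } while (!atomic_compare_exchange_weak(&u->users, &old, old + 1));

    return true;
}

/* Drop a use. Returns true if this was the last user. */
static bool usage_put(usage_count *u)
{
    return atomic_fetch_sub(&u->users, 1) == 1;
}

/* Current number of users. */
static int usage_read(usage_count *u)
{
    return atomic_load(&u->users);
}
```

Because a failed usage_try_get() changes nothing, the counter keeps
working after the limit is hit: once someone calls usage_put(), the
next usage_try_get() succeeds again.
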