On Mon, Oct 25, 2021 at 02:17:58PM -0700, Linus Torvalds wrote: > On Mon, Oct 25, 2021 at 2:04 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote: > > > > Is secretmem different? We're trying to count how many of these we have, > > this is a common pattern in, for example, the network code which does > > this kind of thing a lot. > > Yes, secretmem is different. Prefix: I'm fine with this being whatever; it's a corner case, so this reply is mainly about nailing down the rationales for future decisions. > A refcount being zero means that the data it referenced no longer exists. I don't disagree with this definition, but I would like to understand how some other use-cases fit into this. The case of basic allocated object memory lifetime reference counting we all agree on. What about the case of what I see that is more like a "shared resource usage count" where the shared resource doesn't necessarily disappear when we reach "no users"? i.e. there is some resource, and it starts its life with no one using it (count = 1). Then we have users declaring their use (count = 2) and later release (count = 1) of the resource. It's not really ever unallocated when users == 0 (count == 1), but we might use the usage counter for other things and don't want to let it go negative or ever increment from zero (in case zero is used for marking the resource unavailable forever). For example, protocols knowing if there are any sockets left open, crypto API usage counts, kernel module usage counts, etc. I don't see as clear a distinction between secretmem and the above examples. The question being answered is "how many users do I have?" and we want to make sure we don't end up with overflow or underflow given the (unfortunately) common case of reference counting kernel bugs. The fact that secretmem doesn't have its own allocation to free when it hits 0 seems like just an implementation detail of this particular resource usage counter. But, ignoring secretmem for a moment, what about a specific example from above: the module loader's refcnt atomic_t, which is actually maintaining an allocation (the module), and uses usage counts of 0 to mean "I am removing this module", and 1 is "I have no users", 2 is "1 user", etc. It seems like this should use refcount_t to me? > Stop arguing for garbage. It was wrong, just admit it. The semantics > for "refcount" is something else than what that code had. As a result, > it got reverted. I've applied Willy's patch that actually makes sense. > > Arguing for garbage in the name of "security" is still garbage. In > fact, it only causes confusion, and as such is likely to result in > problems - including security problems - later. Sure, and reasonable people can disagree about what is garbage. :) I see using a refcount_t here as a not unreasonable way to protect against potential future security problems if the scope of secretmem ever grows and more than just hibernation starts to care about this usage counter. > Because confusion about semantics is bad. And that was what that patch was. > > And I want to state - again - how dangerous this "refcounts are always > prefereable to atomics" mental model is. Refcounts are _not_ > fundamentally preferable to atomics. They are slower, bigger, and have > completely different semantics. I will push back on this; I don't think that's a fair assessment at all. For storage size, refcount_t is identical to atomic_t (refcount_t _is_ an atomic_t). But perhaps you meant code size? Yes, the refcount_t helpers are technically larger. But like speed, where refcount_t is also technically slower, neither size nor speed were so much changed that we kept around the routines that made it exactly the same speed and grew the instruction count by 1 (the original x86-specific refcount_t implementation was just as fast as atomic_t). I don't see speed nor size alone to be a good reason to say "don't use refcount_t". But yes, I agree about the different semantics: there are very specific memory ordering assumptions that tend to be more strict than atomic_t (which IMO actually makes it more suitable than atomic_t for shared usage counts). > So if something isn't a refcount, it damn well shouldn't use "refcount_t". Again, I don't disagree, but since it looked like a refcount to me, I'd like to understand what why we don't see this case the same way. Since I agree that secretmem is currently pretty iffy (nothing actually allocated to track, only system state), I'll ask a slightly different question: should the module loader use refcount_t? If not, why? -- Kees Cook