On Fri, May 03, 2024 at 12:53:56AM +0200, Jann Horn wrote: > On Fri, May 3, 2024 at 12:34 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote: > > If f_count reaches 0, calling get_file() should be a failure. Adjust to > > use atomic_long_inc_not_zero() and return NULL on failure. In the future > > get_file() can be annotated with __must_check, though that is not > > currently possible. > [...] > > static inline struct file *get_file(struct file *f) > > { > > - atomic_long_inc(&f->f_count); > > + if (unlikely(!atomic_long_inc_not_zero(&f->f_count))) > > + return NULL; > > return f; > > } > > Oh, I really don't like this... > > In most code, if you call get_file() on a file and see refcount zero, > that basically means you're in a UAF write situation, or that you > could be in such a situation if you had raced differently. It's > basically just like refcount_inc() in that regard. Shouldn't the system attempt to not make things worse if it encounters an inc-from-0 condition? Yes, we've already lost the race for a UaF condition, but maybe don't continue on. > And get_file() has semantics just like refcount_inc(): The caller > guarantees that it is already holding a reference to the file; and if Yes, but if that guarantee is violated, we should do something about it. > the caller is wrong about that, their subsequent attempt to clean up > the reference that they think they were already holding will likely > lead to UAF too. If get_file() sees a zero refcount, there is no safe > way to continue. And all existing callers of get_file() expect the > return value to be the same as the non-NULL pointer they passed in, so > they'll either ignore the result of this check and barrel on, or oops > with a NULL deref. > > For callers that want to actually try incrementing file refcounts that > could be zero, which is only possible under specific circumstances, we > have helpers like get_file_rcu() and get_file_active(). So what's going on in here: https://lore.kernel.org/linux-hardening/20240502223341.1835070-2-keescook@xxxxxxxxxxxx/ > Can't you throw a CHECK_DATA_CORRUPTION() or something like that in > there instead? I'm open to suggestions, but given what's happening with struct dma_buf above, it seems like this is a state worth checking for? -- Kees Cook