On Thu, Mar 02, 2023 at 11:38:50AM -0800, Kees Cook wrote:
> On Thu, Mar 02, 2023 at 11:10:03AM -0800, Linus Torvalds wrote:
> > On Thu, Mar 2, 2023 at 11:03 AM Linus Torvalds
> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > It might be best if we actually exposed it as a SLAB_SKIP_ZERO thing,
> > > just to make it possible to say - exactly in situations like this -
> > > that this particular slab cache has no advantage from pre-zeroing.
> >
> > Actually, maybe it's just as well to keep it per-allocation, and just
> > special-case getname_flags() itself.
> >
> > We could replace the __getname() there with just a
> >
> >     kmem_cache_alloc(names_cachep, GFP_KERNEL | __GFP_SKIP_ZERO);
> >
> > we're going to overwrite the beginning of the buffer with the path we
> > copy from user space, and then we'd have to make people comfortable
> > with the fact that even with zero initialization hardening on, the
> > space after the filename wouldn't be initialized...
>
> Yeah, I'd love to have a way to safely opt-out of always-zero. The
> discussion[1] when we originally did this devolved into a guessing
> game on performance since no one could actually point to workloads
> that were affected by it, beyond skbuff[2]. So in the interest of not
> over-engineering a solution to an unknown problem, the plan was once
> someone found a problem, we could find a sensible solution at that
> time. And so here we are! :)
>
> I'd always wanted to avoid a "don't zero" flag and instead adjust APIs so
> the allocation could include a callback to do the memory content filling
> that would return a size-that-was-initialized result. That way we don't
> end up in the situations we've seen so many times with drivers, etc,
> where an uninit buffer is handed off and some path fails to actually
> fill it with anything. However, in practice, I think this kind of API
> change becomes really hard to do.
>

Having not been following init_on_alloc very closely myself, I'm a bit
surprised that an opt-out flag never made it into the final version.

Was names_cachep considered in those earlier discussions? I think that's
a pretty obvious use case for an opt-out. Every syscall that operates on
a path allocates a 4K buffer from names_cachep.

- Eric