On Mon, May 6, 2019 at 9:15 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> Umm... Where would you put the cutoff for try_dget()? 1G? Because
> 2G-<something relatively small> is risky - have it reached, then
> get the rest of the way to 2G by normal dget() and you've got trouble.

I'd make the limit be 2G, exactly like the page count.

Negative counts are fine - they work exactly like large integers. It's
only 0 that is special.

So do something like this:

 - make dget() WARN_ONCE(), and perhaps set a flag to start background
   dentry pruning, if the dentry count is negative ("big integer") after
   the lockref_get()

 - add a try_dget(), which returns the dentry or NULL (and is
   "must_check") and just refuses to increment the ref past the 2G mark

 - add the "limit negative dentries" patches that were already written
   for other reasons by Waiman Long

 - and exactly like the page ref count, the negative values can be
   tested non-atomically without worrying about races, because it's not
   a "hard" limit. It takes a *looong* time (and a lot of memory) to go
   from 2G to actually overflowing

 - for the same "not a hard limit" reason, use try_dget() in a couple of
   strategic places that are easy to error out for and that are
   particularly easily user-triggerable. It's not clear this is even
   needed, since the only obviously user-triggerable case is the
   negative-dentry one - everything else really needs an actual user
   ref, and the soft "start to try to prune if any dentry ref goes
   negative" will take care of the "we just have a ton of unused but
   cached dentries" case.

All pretty much exactly like the page count. The fact that we have that
"slop" of 2 _billion_ references between "oh, the refcount went
negative" and "oops, now we overflowed and that would be fatal" really
means that we have a lot of time and flexibility to handle things.

If an attacker has to open two billion files, the attacker is going to
spend a lot of time that we can mitigate.

                 Linus
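
As a rough userspace model of the scheme sketched above (not actual
kernel code): the names mock_dentry, mock_dget() and mock_try_dget()
are made up for illustration, and a plain C11 atomic stands in for the
kernel's lockref. The only point is the split between "warn and keep
going when the count goes negative", "try_dget() refuses past the 2G
mark", and "the negative test can be non-atomic because of the 2G of
slop":

/*
 * Minimal userspace sketch of the refcount scheme described above.
 * All names here are hypothetical stand-ins, not kernel APIs.
 */
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct mock_dentry {
	atomic_int d_count;	/* stand-in for d_lockref.count */
};

/*
 * "Hard" dget(): always takes the reference, but warns (once) when the
 * count has gone negative, i.e. crossed the 2G mark.  In the kernel
 * this would be a WARN_ONCE() plus perhaps a flag that kicks off
 * background dentry pruning.
 */
static struct mock_dentry *mock_dget(struct mock_dentry *d)
{
	static bool warned;
	int count = atomic_fetch_add(&d->d_count, 1) + 1;

	if (count < 0 && !warned) {
		warned = true;
		fprintf(stderr, "dentry count went negative (%d): start pruning\n",
			count);
	}
	return d;
}

/*
 * try_dget(): refuses to take a reference once the count has already
 * crossed the 2G mark.  The check is deliberately non-atomic with the
 * increment: there are ~2 billion increments of slop between "negative"
 * and actually wrapping, so this is a soft limit, not a hard one.
 */
static __attribute__((warn_unused_result))
struct mock_dentry *mock_try_dget(struct mock_dentry *d)
{
	if (atomic_load(&d->d_count) < 0)
		return NULL;
	atomic_fetch_add(&d->d_count, 1);
	return d;
}

int main(void)
{
	struct mock_dentry d = { .d_count = 5 };

	/* Normal path: both helpers take the reference. */
	mock_dget(&d);
	if (!mock_try_dget(&d))
		return EXIT_FAILURE;

	/* Simulate a count that has already crossed the 2G mark. */
	atomic_store(&d.d_count, INT_MIN + 100);
	mock_dget(&d);			/* warns, but still succeeds */
	if (!mock_try_dget(&d))		/* refuses: caller must error out */
		printf("try_dget refused past the 2G mark\n");

	return 0;
}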