Re: [PATCH 09/19] VFS: add _async versions of the various directory modifying inode_operations

Al Viro <viro@xxxxxxxxxxxxxxxxxx> · Sun, 9 Feb 2025 04:57:05 +0000

On Sun, Feb 09, 2025 at 01:09:10AM +0000, Al Viro wrote:
> On Fri, Feb 07, 2025 at 10:41:34PM +0000, Al Viro wrote:
> 
> > I'm sorry, but I don't buy the "complete with no lock on directory"
> > part - not without a verifiable proof of correctness of the locking
> > scheme.  Especially if you are putting rename into the mix.
> > 
> > And your method prototypes pretty much bake that in.
> > 
> > *IF* we intend to try going that way (and I'm not at all convinced
> > that it's feasible - locking aside, there's also a shitload of fun
> > with fsnotify, audit, etc.), let's make those new methods take
> > a single argument - something like struct mkdir_args, etc., with
> > inlines for extracting individual arguments out of that.  Yes, it's
> > ugly, but it allows later changes without a massive headache on
> > each calling convention modification.
> > 
> > Said that, an explicit description of locking scheme and a proof of
> > correctness (at least on the "it can't deadlock" level) is, IMO,
> > a hard requirement for the entire thing, async or no async.
> > 
> > We *do* have such for the current locking scheme.
> 
> While we are at it, the locking order is... interesting.  You
> have
> 	* parent's ->i_rwsem before child's d_update_lock()
> 	* for a child, d_update_lock() before ->i_rwsem
> and that - on top of ordering between ->i_rwsem of various
> inodes.
> 
> Do you actually have a proof that it's deadlock-free?

Note that "child's d_update_lock()" might very well be sleeping
on something that is no longer the parent's child, so the
ordering by depth, with ->i_rwsem and d_update_lock() interspersed,
does not hold.

What am I missing here?  I'd been trying to come up with
a proof of deadlock avoidance, but... no luck so far.
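
To make the worry concrete, here is a toy wait-for-graph model.  It is
not kernel code and not taken from the patch series.  The lock names, the
helper model_has_cycle() and the scenario itself are assumptions made
purely for illustration: a task holding P's ->i_rwsem sleeps in
d_update_lock() on C, and by the time it matters C has been moved out of
P and P has ended up somewhere beneath C, so that an ancestor-first
->i_rwsem order would now allow C's ->i_rwsem to be taken before P's.
Whether the proposed scheme can actually reach that state is exactly the
open question; the sketch only shows that the two quoted rules plus such
an edge would close a cycle.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy model, not kernel code.  Nodes are locks; an edge A -> B means
 * "some task may hold A while blocking on B".
 */
enum lock_id {
	P_I_RWSEM,	/* ->i_rwsem of directory P */
	C_UPDATE,	/* d_update_lock() on dentry C */
	C_I_RWSEM,	/* ->i_rwsem of C's inode */
	NLOCKS
};

static int edge[NLOCKS][NLOCKS];

static void add_edge(enum lock_id from, enum lock_id to)
{
	assert(from < NLOCKS && to < NLOCKS);
	edge[from][to] = 1;
}

/* Depth-first search; state: 0 = unvisited, 1 = on stack, 2 = done. */
static int dfs(int v, int *state)
{
	state[v] = 1;
	for (int w = 0; w < NLOCKS; w++) {
		if (!edge[v][w])
			continue;
		if (state[w] == 1)
			return 1;	/* back edge: cycle found */
		if (state[w] == 0 && dfs(w, state))
			return 1;
	}
	state[v] = 2;
	return 0;
}

static int has_cycle(void)
{
	int state[NLOCKS] = { 0 };

	for (int v = 0; v < NLOCKS; v++)
		if (state[v] == 0 && dfs(v, state))
			return 1;
	return 0;
}

/*
 * Build the graph from the two quoted ordering rules.  If
 * c_moved_above_p is nonzero, also add the hypothetical edge that
 * appears once C is no longer P's child but one of its ancestors.
 */
int model_has_cycle(int c_moved_above_p)
{
	memset(edge, 0, sizeof(edge));
	add_edge(P_I_RWSEM, C_UPDATE);	/* parent's ->i_rwsem, then child's d_update_lock() */
	add_edge(C_UPDATE, C_I_RWSEM);	/* child's d_update_lock(), then its ->i_rwsem */
	if (c_moved_above_p)
		add_edge(C_I_RWSEM, P_I_RWSEM);	/* assumed: ancestor-first ->i_rwsem order */
	return has_cycle();
}
```

In this model model_has_cycle(0) returns 0 and model_has_cycle(1)
returns 1.  A proof of deadlock avoidance would have to show that the
third edge can never exist while the first two are in play.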