Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:

> On Thu, Nov 23, 2023 at 12:37:43PM -0500, Gabriel Krisman Bertazi wrote:
>> > That's the problem I'd been talking about - there is a class of situations
>> > where the work done by ext4_lookup() to set the state of dentry gets
>> > completely lost. After lookup you do have a dentry in the right place,
>> > with the right name and inode, etc., but with NULL
>> > ->d_op->d_revalidate.
>>
>> I get the problem now. I admit to not understanding all the details yet,
>> which is why I haven't answered directly, but I understand already how
>> it can get borked. I'm studying your explanation.
>>
>> Originally, ->d_op could be propagated trivially since we had sb->s_d_op
>> set, which would be set by __d_alloc, but that is no longer the case
>> since we combined fscrypt and CI support.
>>
>> What I still don't understand is why we shouldn't fixup ->d_op when
>> calling d_obtain_alias (before __d_instantiate_anon) and you say we
>> better do it in d_splice_alias. The ->d_op is going to be the same
>> across the filesystem when the casefold feature is enabled, regardless
>> if the directory is casefolded. If we set it there, the alias already
>> has the right d_op from the start.
>
> *blink*
>
> A paragraph above you've said that it's not constant over the entire
> filesystem.

The same ->d_op is used by every dentry in the filesystem if the
superblock has the casefold bit enabled, regardless of whether a
specific inode is casefolded or not. See
generic_set_encrypted_ci_d_ops in my tree. It is called
unconditionally by ext4_lookup and only checks the superblock:

void generic_set_encrypted_ci_d_ops(struct dentry *dentry)
{
	if (dentry->d_sb->s_encoding) {
		d_set_d_op(dentry, &generic_encrypted_ci_dentry_ops);
		return;
	}
	...

What I meant was that this used to be set once at sb->s_d_op, and
propagated during dentry allocation. Therefore, the propagation to the
alias would happen inside __d_alloc.
Once we enabled fscrypt and casefold to work together, sb->s_d_op is
NULL and we always set the same handler for every dentry during lookup.

> Look, it's really simple - any setup work of that sort done in ->lookup()
> is either misplaced, or should be somehow transferred over to the alias
> if one gets picked.
>
> As for d_obtain_alias()... AFAICS, it's far more limited in what information
> it could access. It knows the inode, but it has no idea about the parent
> to be.

Since it has the inode, d_obtain_alias has the superblock. I think
that's all we need for generic_set_encrypted_ci_d_ops.

> The more I look at that, the more it feels like we need a method that would
> tell the filesystem that this dentry is about to be spliced here. 9p is
> another place where it would obviously simplify the things; ocfs2 'attach
> lock' stuff is another case where the things get much more complicated
> by having to do that stuff after splicing, etc.
>
> It's not even hard to do:
>
> 1. turn bool exchange in __d_move() arguments into 3-value thing - move,
> exchange or splice. Have the callers in d_splice_alias() and __d_unalias()
> pass "splice" instead of false (aka normal move).
>
> 2. make __d_move() return an int (normally 0)
>
> 3. if asked to splice and if there's target->d_op->d_transfer(), let
> __d_move() call it right after
>         spin_lock_nested(&dentry->d_lock, 2);
>         spin_lock_nested(&target->d_lock, 3);
> in there. Passing it target and dentry, obviously. In unlikely case
> of getting a non-zero returned by the method, undo locks and return
> that value to __d_move() caller.
>
> 4.
> d_move() and d_exchange() would ignore the value returned by __d_move();
> __d_unalias() turn
>         __d_move(alias, dentry, false);
>         ret = 0;
> into
>         ret = __d_move(alias, dentry, Splice);
> d_splice_alias() turn
>         __d_move(new, dentry, false);
>         write_sequnlock(&rename_lock);
> into
>         err = __d_move(new, dentry, Splice);
>         write_sequnlock(&rename_lock);
>         if (unlikely(err)) {
>                 dput(new);
>                 new = ERR_PTR(err);
>         }
> (actually, dput()-on-error part would be common to all 3 branches
> in there, so it would probably get pulled out of that if-else if-else).
>
> I can cook a patch doing that (and convert the obvious beneficiaries already
> in the tree to it) and throw it into dcache branch - just need to massage
> the series in there for repost...

if you can write that, I'll definitely appreciate it. It will surely
take me much longer to figure it out myself.

-- 
Gabriel Krisman Bertazi
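
For anyone following the four-step proposal quoted above, here is a
minimal userspace sketch of its control flow. It is not kernel code and
not the patch under discussion. Every model_* name is hypothetical, the
hook signature is only inferred from "Passing it target and dentry", an
int flag stands in for ->d_lock, and the real name/parent switch is
reduced to a pointer assignment. It only illustrates when the hook
would run and how an error would propagate.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; these are not the kernel's types or functions. */
enum model_move_mode { MODEL_MOVE, MODEL_EXCHANGE, MODEL_SPLICE };

struct model_dentry;

struct model_dentry_ops {
	/* Hypothetical hook named after the proposal's ->d_transfer(). */
	int (*d_transfer)(struct model_dentry *target,
			  struct model_dentry *dentry);
};

struct model_dentry {
	const struct model_dentry_ops *d_op;
	const char *name;
	int locked;	/* stands in for ->d_lock */
};

/* Steps 1-3: three-valued mode, int return, hook called under both locks. */
static int model_d_move(struct model_dentry *dentry,
			struct model_dentry *target,
			enum model_move_mode mode)
{
	dentry->locked = 1;
	target->locked = 1;

	if (mode == MODEL_SPLICE && target->d_op && target->d_op->d_transfer) {
		int err = target->d_op->d_transfer(target, dentry);

		if (err) {
			/* Undo the locks and report the failure. */
			target->locked = 0;
			dentry->locked = 0;
			return err;
		}
	}

	if (mode == MODEL_EXCHANGE) {
		const char *tmp = dentry->name;

		dentry->name = target->name;
		target->name = tmp;
	} else {
		dentry->name = target->name;
	}

	target->locked = 0;
	dentry->locked = 0;
	return 0;
}

/* Example hook: hand the looked-up dentry's ops to the alias. */
static int model_transfer_ops(struct model_dentry *target,
			      struct model_dentry *dentry)
{
	dentry->d_op = target->d_op;
	return 0;
}

/* Example hook that always fails, to exercise the error path. */
static int model_transfer_fail(struct model_dentry *target,
			       struct model_dentry *dentry)
{
	(void)target;
	(void)dentry;
	return -ENOMEM;
}

static const struct model_dentry_ops model_ci_ops = {
	.d_transfer = model_transfer_ops,
};

static const struct model_dentry_ops model_failing_ops = {
	.d_transfer = model_transfer_fail,
};

/* Step 4, splice caller's side: returns 0 if the model behaves as described. */
int model_demo(void)
{
	struct model_dentry looked_up = { &model_ci_ops, "Foo", 0 };
	struct model_dentry alias = { NULL, "old", 0 };
	struct model_dentry bad = { &model_failing_ops, "Bar", 0 };
	struct model_dentry alias2 = { NULL, "old2", 0 };

	if (model_d_move(&alias, &looked_up, MODEL_SPLICE) != 0)
		return 1;
	if (alias.d_op != &model_ci_ops || strcmp(alias.name, "Foo") != 0)
		return 2;
	if (model_d_move(&alias2, &bad, MODEL_SPLICE) != -ENOMEM)
		return 3;
	if (alias2.d_op != NULL || strcmp(alias2.name, "old2") != 0 || alias2.locked)
		return 4;
	return 0;
}
```

In this model a plain move or exchange never calls the hook, so only
the splice callers have to check the return value, which matches step 4
of the quoted proposal.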