On 6/14/22 4:36 PM, Luís Henriques wrote:
Xiubo Li <xiubli@xxxxxxxxxx> writes:
On 6/14/22 12:07 AM, Luís Henriques wrote:
Xiubo Li <xiubli@xxxxxxxxxx> writes:
For async create we will always try to choose the auth MDS of frag
the dentry belonged to of the parent directory to send the request
and ususally this works fine, but if the MDS migrated the directory
to another MDS before it could be handled the request will be
forwarded. And then the auth cap will be changed.
We need to update the auth cap in this case before the request is
forwarded.
URL: https://tracker.ceph.com/issues/55857
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
fs/ceph/file.c | 12 +++++++++
fs/ceph/mds_client.c | 58 ++++++++++++++++++++++++++++++++++++++++++++
fs/ceph/super.h | 2 ++
3 files changed, 72 insertions(+)
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 0e82a1c383ca..54acf76c5e9b 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -613,6 +613,7 @@ static int ceph_finish_async_create(struct inode *dir, struct inode *inode,
struct ceph_mds_reply_inode in = { };
struct ceph_mds_reply_info_in iinfo = { .in = &in };
struct ceph_inode_info *ci = ceph_inode(dir);
+ struct ceph_dentry_info *di = ceph_dentry(dentry);
struct timespec64 now;
struct ceph_string *pool_ns;
struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb);
@@ -709,6 +710,12 @@ static int ceph_finish_async_create(struct inode *dir, struct inode *inode,
file->f_mode |= FMODE_CREATED;
ret = finish_open(file, dentry, ceph_open);
}
+
+ spin_lock(&dentry->d_lock);
+ di->flags &= ~CEPH_DENTRY_ASYNC_CREATE;
+ wake_up_bit(&di->flags, CEPH_DENTRY_ASYNC_CREATE_BIT);
+ spin_unlock(&dentry->d_lock);
Question: shouldn't we initialise 'di' *after* grabbing ->d_lock? Ceph
code doesn't seem to be consistent with this regard, but my understanding
is that doing it this way is racy. And if so, some other places may need
fixing.
Yeah, it should be.
BTW, do you mean some where like this:
if (!test_bit(CEPH_DENTRY_ASYNC_UNLINK_BIT, &di->flags))
?
If so, it's okay and no issue here.
No, I meant patterns like the one above, where you initialize a pointer to
a struct ceph_dentry_info (from ->d_fsdata) and then grab ->d_lock. For
example, in splice_dentry(). I think there are a few other places where
this pattern can be seen, even in other filesystems. So maybe it's not
really an issue, and that's why I asked: does this lock really protects
accesses to ->d_fsdata?
Okay, get it.
I think it should be okay, the dentry reference has been increased
during these. In theory the ->d_fsdata shouldn't be released while we
are accessing it if I didn't miss something important.
-- Xiubo
Cheers,