Re: [RFC PATCH] fs/lock: increase the filp's reference for Posix-style locks

Xiubo Li <xiubli@xxxxxxxxxx> · Mon, 7 Nov 2022 20:44:24 +0800

On 07/11/2022 20:29, Jeff Layton wrote:
On Mon, 2022-11-07 at 20:03 +0800, Xiubo Li wrote:
On 07/11/2022 18:33, Jeff Layton wrote:
On Mon, 2022-11-07 at 17:52 +0800, xiubli@xxxxxxxxxx wrote:
[...]

diff --git a/io_uring/openclose.c b/io_uring/openclose.c
index 67178e4bb282..5a12cdf7f8d0 100644
--- a/io_uring/openclose.c
+++ b/io_uring/openclose.c
@@ -212,6 +212,7 @@ int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
   int io_close(struct io_kiocb *req, unsigned int issue_flags)
   {
   	struct files_struct *files = current->files;
+	fl_owner_t owner = file_lock_make_thread_owner(files);
   	struct io_close *close = io_kiocb_to_cmd(req, struct io_close);
   	struct fdtable *fdt;
   	struct file *file;
@@ -247,7 +248,7 @@ int io_close(struct io_kiocb *req, unsigned int issue_flags)
   		goto err;
   
   	/* No ->flush() or already async, safely close from here */
-	ret = filp_close(file, current->files);
+	ret = filp_close(file, owner);
   err:
   	if (ret < 0)
   		req_set_fail(req);
I think this is the wrong approach to fixing this. It also looks like
you could hit a similar problem with OFD locks and this patch wouldn't
address that issue.
For the OFD locks they will set the 'file' struct as the owner just as
the flock does, it should be okay and I don't think it has this issue if
my understanding is correct here.

They set the the owner to "file", but they don't hold a reference to it.
With OFD locks, the file is what holds references to the lock, not the
reverse.

Yeah, right. But for both OFD and flock they shouldn't hit this issue, 
because it when removing all the locks having the same owner, which is 
the 'file', passed by filp_close(filp), the 'file' reference counter 
must be larger than 0. Because the filp_close() is still using it.

This is why using the thread id as the owner is a special case for 
Posix-style lock.


The real bug seems to be that ceph_fl_release_lock dereferences fl_file,
at a point when it shouldn't rely on that being valid. Most filesystems
stash some info in fl->fl_u if they need to do bookkeeping after
releasing a lock. Perhaps ceph should be doing something similar?
This is the 'filp' memory in filp_close(filp, ...):

crash> file.f_path.dentry,f_inode 0xffff952d7ab46200
    f_path.dentry = 0xffff9521b121cb40
    f_inode = 0xffff951f3ea33550,

We can see the 'f_inode' is pointing to the correct inode memory.



While later in 'ceph_fl_release_lock()':

41 static void ceph_fl_release_lock(struct file_lock *fl)
42 {
43     struct ceph_file_info *fi = fl->fl_file->private_data;
44     struct inode *inode = file_inode(fl->fl_file);
45     struct ceph_inode_info *ci = ceph_inode(inode);
46     atomic_dec(&fi->num_locks);
47     if (atomic_dec_and_test(&ci->i_filelock_ref)) {
48         /* clear error when all locks are released */
49         spin_lock(&ci->i_ceph_lock);
50         ci->i_ceph_flags &= ~CEPH_I_ERROR_FILELOCK;
51         spin_unlock(&ci->i_ceph_lock);
52     }
53 }

You only need the inode for most of this. The exception is
fi->num_locks, so you may need to test for that in a different way.

It crashed in Line#47 and the 'fl->fl_file' memory is:

crash> file.f_path.dentry,f_inode 0xffff952d4ebd8a00
    f_path.dentry = 0x0
    f_inode = 0x0,

Please NOTE: the 'filp' and 'fl->fl_file' are two different 'file struct'.

Yep, I understand the bug. I just don't like the proposed fix. :)

Yeah, I also think this approach is ugly :-)

Can we fix this by using 'fl->fl_u' here ?

Probably. You could take and hold an inode reference in there, and maybe
add a function that looks at whether there are any locks held against a
particular file, rather than trying to count locks in ceph_file_info.

Okay, this sounds good.

Let me try this tomorrow.

I was also thinking I could just call the 'get_file(file)' in
ceph_lock() and then in ceph_fl_release_lock() release the reference
counter. How about this ?

That may work too, though again, I'd be worried about cyclical
dependencies, particularly with OFD locks. If the lock holds a reference
to the file, then can the file's refcount ever go to zero if the lock is
never explicitly released? I think not.

You may also need to consider flock locks too, since they have similar
ownership semantics to OFD locks.

I will send a V2 later.

Thanks Jeff!

- Xiubo