Re: [PATCH v3] ceph: fix use-after-free bug for inodes when flushing capsnaps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 6/6/23 18:21, Ilya Dryomov wrote:
On Tue, Jun 6, 2023 at 2:55 AM <xiubli@xxxxxxxxxx> wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>

There is a race between capsnaps flush and removing the inode from
'mdsc->snap_flush_list' list:

    == Thread A ==                     == Thread B ==
ceph_queue_cap_snap()
  -> allocate 'capsnapA'
  ->ihold('&ci->vfs_inode')
  ->add 'capsnapA' to 'ci->i_cap_snaps'
  ->add 'ci' to 'mdsc->snap_flush_list'
     ...
    == Thread C ==
ceph_flush_snaps()
  ->__ceph_flush_snaps()
   ->__send_flush_snap()
                                 handle_cap_flushsnap_ack()
                                  ->iput('&ci->vfs_inode')
                                    this also will release 'ci'
                                     ...
                                       == Thread D ==
                                 ceph_handle_snap()
                                  ->flush_snaps()
                                   ->iterate 'mdsc->snap_flush_list'
                                    ->get the stale 'ci'
  ->remove 'ci' from                ->ihold(&ci->vfs_inode) this
    'mdsc->snap_flush_list'           will WARNING

To fix this we will increase the inode's i_count ref when adding 'ci'
to the 'mdsc->snap_flush_list' list.

Cc: stable@xxxxxxxxxxxxxxx
URL: https://bugzilla.redhat.com/show_bug.cgi?id=2209299
Reviewed-by: Milind Changire <mchangir@xxxxxxxxxx>
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---

V3:
- Fix two minor typo in commit comments.



  fs/ceph/caps.c | 6 ++++++
  fs/ceph/snap.c | 4 +++-
  2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index feabf4cc0c4f..7c2cb813aba4 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1684,6 +1684,7 @@ void ceph_flush_snaps(struct ceph_inode_info *ci,
         struct inode *inode = &ci->netfs.inode;
         struct ceph_mds_client *mdsc = ceph_inode_to_client(inode)->mdsc;
         struct ceph_mds_session *session = NULL;
+       int put = 0;
Hi Xiubo,

Nit: renaming this variable to need_put and making it a bool would
communicate the intent better.

Hi Ilya

Sure, will update it.

         int mds;

         dout("ceph_flush_snaps %p\n", inode);
@@ -1728,8 +1729,13 @@ void ceph_flush_snaps(struct ceph_inode_info *ci,
                 ceph_put_mds_session(session);
         /* we flushed them all; remove this inode from the queue */
         spin_lock(&mdsc->snap_flush_lock);
+       if (!list_empty(&ci->i_snap_flush_item))
+               put++;
What are the cases when ci is expected to not be on snap_flush_list
list (and therefore there is no corresponding reference to put)?

The reason I'm asking is that ceph_flush_snaps() is called from two
other places directly (i.e. without iterating snap_flush_list list) and
then __ceph_flush_snaps() is called from two yet other places.  The
problem that we are presented with here is that __ceph_flush_snaps()
effectively consumes a reference on ci.  Is ci protected from being
freed by handle_cap_flushsnap_ack() very soon after __send_flush_snap()
returns in all these other places?

There are 4 places will call the 'ceph_flush_snaps()':

Cscope tag: ceph_flush_snaps
   #   line  filename / context / line
   1   3221  fs/ceph/caps.c <<__ceph_put_cap_refs>>
             ceph_flush_snaps(ci, NULL);
   2   3336  fs/ceph/caps.c <<ceph_put_wrbuffer_cap_refs>>
             ceph_flush_snaps(ci, NULL);
   3   2243  fs/ceph/inode.c <<ceph_inode_work>>
             ceph_flush_snaps(ci, NULL);
   4    941  fs/ceph/snap.c <<flush_snaps>>
             ceph_flush_snaps(ci, &session);
Type number and <Enter> (q or empty cancels):

For #1 it will add the 'ci' to the 'mdsc->snap_flush_list' list by calling '__ceph_finish_cap_snap()' and then call the 'ceph_flush_snaps()' directly or defer call it in the queue work in #3.

The #3 is the reason why we need the 'mdsc->snap_flush_list' list.

For #2 it won't add the 'ci' to the list because it will always call the 'ceph_flush_snaps()' directly.

For #4 it will call 'ceph_flush_snaps()' by iterating the 'mdsc->snap_flush_list' list just before the #3 being triggered.

The problem only exists in case of #1 --> #4, which will make the stale 'ci' to be held in the 'mdsc->snap_flush_list' list after 'capsnap' and 'ci' being freed. All the other cases are okay because the 'ci' will be protected by increasing the ref when allocating the 'capsnap' and will decrease the ref in 'handle_cap_flushsnap_ack()' when freeing the 'capsnap'.

Note: the '__ceph_flush_snaps()' won't increase the ref. The 'handle_cap_flushsnap_ack()' will just try to decrease the ref and only in case the ref reaches to '0' will the 'ci' be freed.


Thanks

- Xiubo

Thanks,

                 Ilya





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux