Re: [PATCH v2 4/5] cachefiles: cyclic allocation of msg_id to avoid reuse

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jeff,

Thank you very much for your review!

On 2024/5/19 19:11, Jeff Layton wrote:
On Wed, 2024-05-15 at 20:51 +0800, libaokun@xxxxxxxxxxxxxxx wrote:
From: Baokun Li <libaokun1@xxxxxxxxxx>

Reusing the msg_id after a maliciously completed reopen request may cause
a read request to remain unprocessed and result in a hung, as shown below:

        t1       |      t2       |      t3
-------------------------------------------------
cachefiles_ondemand_select_req
  cachefiles_ondemand_object_is_close(A)
  cachefiles_ondemand_set_object_reopening(A)
  queue_work(fscache_object_wq, &info->work)
                 ondemand_object_worker
                  cachefiles_ondemand_init_object(A)
                   cachefiles_ondemand_send_req(OPEN)
                     // get msg_id 6
                     wait_for_completion(&req_A->done)
cachefiles_ondemand_daemon_read
  // read msg_id 6 req_A
  cachefiles_ondemand_get_fd
  copy_to_user
                                 // Malicious completion msg_id 6
                                 copen 6,-1
                                 cachefiles_ondemand_copen
                                  complete(&req_A->done)
                                  // will not set the object to close
                                  // because ondemand_id && fd is valid.

                 // ondemand_object_worker() is done
                 // but the object is still reopening.

                                 // new open req_B
                                 cachefiles_ondemand_init_object(B)
                                  cachefiles_ondemand_send_req(OPEN)
                                  // reuse msg_id 6
process_open_req
  copen 6,A.size
  // The expected failed copen was executed successfully

Expect copen to fail, and when it does, it closes fd, which sets the
object to close, and then close triggers reopen again. However, due to
msg_id reuse resulting in a successful copen, the anonymous fd is not
closed until the daemon exits. Therefore read requests waiting for reopen
to complete may trigger hung task.

To avoid this issue, allocate the msg_id cyclically to avoid reusing the
msg_id for a very short duration of time.

Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Signed-off-by: Baokun Li <libaokun1@xxxxxxxxxx>
---
  fs/cachefiles/internal.h |  1 +
  fs/cachefiles/ondemand.c | 20 ++++++++++++++++----
  2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 8ecd296cc1c4..9200c00f3e98 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -128,6 +128,7 @@ struct cachefiles_cache {
  	unsigned long			req_id_next;
  	struct xarray			ondemand_ids;	/* xarray for ondemand_id allocation */
  	u32				ondemand_id_next;
+	u32				msg_id_next;
  };
static inline bool cachefiles_in_ondemand_mode(struct cachefiles_cache *cache)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index f6440b3e7368..b10952f77472 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -433,20 +433,32 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
  		smp_mb();
if (opcode == CACHEFILES_OP_CLOSE &&
-			!cachefiles_ondemand_object_is_open(object)) {
+		    !cachefiles_ondemand_object_is_open(object)) {
  			WARN_ON_ONCE(object->ondemand->ondemand_id == 0);
  			xas_unlock(&xas);
  			ret = -EIO;
  			goto out;
  		}
- xas.xa_index = 0;
+		/*
+		 * Cyclically find a free xas to avoid msg_id reuse that would
+		 * cause the daemon to successfully copen a stale msg_id.
+		 */
+		xas.xa_index = cache->msg_id_next;
  		xas_find_marked(&xas, UINT_MAX, XA_FREE_MARK);
+		if (xas.xa_node == XAS_RESTART) {
+			xas.xa_index = 0;
+			xas_find_marked(&xas, cache->msg_id_next - 1, XA_FREE_MARK);
+		}
  		if (xas.xa_node == XAS_RESTART)
  			xas_set_err(&xas, -EBUSY);
+
  		xas_store(&xas, req);
-		xas_clear_mark(&xas, XA_FREE_MARK);
-		xas_set_mark(&xas, CACHEFILES_REQ_NEW);
+		if (xas_valid(&xas)) {
+			cache->msg_id_next = xas.xa_index + 1;
If you have a long-standing stuck request, could this counter wrap
around and you still end up with reuse?
Yes, msg_id_next is declared to be of type u32 in the hope that when
xa_index == UINT_MAX, a wrap around occurs so that msg_id_next
goes to zero. Limiting xa_index to no more than UINT_MAX is to avoid
the xarry being too deep.

If msg_id_next is equal to the id of a long-standing stuck request
after the wrap-around, it is true that the reuse in the above problem
may also occur.

But I feel that a long stuck request is problematic in itself, it means
that after we have sent 4294967295 requests, the first one has not
been processed yet, and even if we send a million requests per
second, this one hasn't been completed for more than an hour.

We have a keep-alive process that pulls the daemon back up as
soon as it exits, and there is a timeout mechanism for requests in
the daemon to prevent the kernel from waiting for long periods
of time. In other words, we should avoid the situation where
a request is stuck for a long period of time.

If you think UINT_MAX is not enough, perhaps we could raise
the maximum value of msg_id_next to ULONG_MAX?
Maybe this should be using
ida_alloc/free instead, which would prevent that too?

The id reuse here is that the kernel has finished the open request
req_A and freed its id_A and used it again when sending the open
request req_B, but the daemon is still working on req_A, so the
copen id_A succeeds but operates on req_B.

The id that is being used by the kernel will not be allocated here
so it seems that ida _alloc/free does not prevent reuse either,
could you elaborate a bit more how this works?


+			xas_clear_mark(&xas, XA_FREE_MARK);
+			xas_set_mark(&xas, CACHEFILES_REQ_NEW);
+		}
  		xas_unlock(&xas);
  	} while (xas_nomem(&xas, GFP_KERNEL));

Thanks again!

--
With Best Regards,
Baokun Li





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux