On 4/24/24 11:39 AM, libaokun@xxxxxxxxxxxxxxx wrote:
> From: Baokun Li <libaokun1@xxxxxxxxxx>
>
> The following concurrency may cause a read request to fail to be completed
> and result in a hang:
>
>            t1             |             t2
> ---------------------------------------------------------
>  cachefiles_ondemand_copen
>   req = xa_erase(&cache->reqs, id)
>                              // Anon fd is maliciously closed.
>                              cachefiles_ondemand_fd_release
>                               xa_lock(&cache->reqs)
>                               cachefiles_ondemand_set_object_close(object)
>                               xa_unlock(&cache->reqs)
>   cachefiles_ondemand_set_object_open
>   // No one will ever close it again.
>
>  cachefiles_ondemand_daemon_read
>   cachefiles_ondemand_select_req
>   // Get a read req but its fd is already closed.
>   // The daemon can't issue a cread ioctl with a closed fd, and then hangs.
>
> So add a spin_lock for cachefiles_ondemand_info to protect ondemand_id and
> state, so that we can avoid the above problem in cachefiles_ondemand_copen()
> by using ondemand_id to determine whether the fd has been released.
>
> Signed-off-by: Baokun Li <libaokun1@xxxxxxxxxx>

This indeed looks like a reasonable scenario that the kernel side should
fix, as a non-malicious daemon could also run into it.

How about reusing the &cache->reqs spinlock rather than introducing a new
one, since the &cache->reqs spinlock is already held when setting the
object to the close state in cachefiles_ondemand_fd_release()?

> ---
>  fs/cachefiles/internal.h |  1 +
>  fs/cachefiles/ondemand.c | 16 +++++++++++++++-
>  2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
> index 7745b8abc3aa..45c8bed60538 100644
> --- a/fs/cachefiles/internal.h
> +++ b/fs/cachefiles/internal.h
> @@ -55,6 +55,7 @@ struct cachefiles_ondemand_info {
>  	int ondemand_id;
>  	enum cachefiles_object_state state;
>  	struct cachefiles_object *object;
> +	spinlock_t lock;
>  };
>
>  /*
> diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
> index 898fab68332b..b5e6a851ef04 100644
> --- a/fs/cachefiles/ondemand.c
> +++ b/fs/cachefiles/ondemand.c
> @@ -16,13 +16,16 @@ static int cachefiles_ondemand_fd_release(struct inode *inode,
>  	struct cachefiles_object *object = file->private_data;
>  	struct cachefiles_cache *cache = object->volume->cache;
>  	struct cachefiles_ondemand_info *info = object->ondemand;
> -	int object_id = info->ondemand_id;
> +	int object_id;
>  	struct cachefiles_req *req;
>  	XA_STATE(xas, &cache->reqs, 0);
>
>  	xa_lock(&cache->reqs);
> +	spin_lock(&info->lock);
> +	object_id = info->ondemand_id;
>  	info->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED;
>  	cachefiles_ondemand_set_object_close(object);
> +	spin_unlock(&info->lock);
>
>  	/* Only flush CACHEFILES_REQ_NEW marked req to avoid race with daemon_read */
>  	xas_for_each_marked(&xas, req, ULONG_MAX, CACHEFILES_REQ_NEW) {
> @@ -127,6 +130,7 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args)
>  {
>  	struct cachefiles_req *req;
>  	struct fscache_cookie *cookie;
> +	struct cachefiles_ondemand_info *info;
>  	char *pid, *psize;
>  	unsigned long id;
>  	long size;
> @@ -185,6 +189,14 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args)
>  		goto out;
>  	}
>
> +	info = req->object->ondemand;
> +	spin_lock(&info->lock);
> +	/* The anonymous fd was closed before copen ? */

I would like to see more details in this comment, e.g. put the time
sequence described in the commit message here.
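For example, something along these lines (just a rough sketch of the
comment I have in mind, reusing the time sequence from the commit
message; the exact wording is up to you):

	/*
	 * The anonymous fd was closed before copen, i.e.
	 *
	 *	cachefiles_ondemand_copen
	 *	  req = xa_erase(&cache->reqs, id)
	 *				cachefiles_ondemand_fd_release
	 *				  ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED
	 *				  cachefiles_ondemand_set_object_close(object)
	 *	  cachefiles_ondemand_set_object_open(object)
	 *	  // no one will ever close the object again
	 *
	 * Fail the copen in this case so that the object stays closed and
	 * the read request is completed with an error instead of hanging.
	 */
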
> +	if (info->ondemand_id == CACHEFILES_ONDEMAND_ID_CLOSED) {
> +		spin_unlock(&info->lock);
> +		req->error = -EBADFD;
> +		goto out;
> +	}
>  	cookie = req->object->cookie;
>  	cookie->object_size = size;
>  	if (size)
> @@ -194,6 +206,7 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args)
>  	trace_cachefiles_ondemand_copen(req->object, id, size);
>
>  	cachefiles_ondemand_set_object_open(req->object);
> +	spin_unlock(&info->lock);
>  	wake_up_all(&cache->daemon_pollwq);
>
>  out:
> @@ -596,6 +609,7 @@ int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
>  		return -ENOMEM;
>
>  	object->ondemand->object = object;
> +	spin_lock_init(&object->ondemand->lock);
>  	INIT_WORK(&object->ondemand->ondemand_work, ondemand_object_worker);
>  	return 0;
>  }

-- 
Thanks,
Jingbo