On 2024/2/21 13:16, Dmitry Antipov wrote:
I've tracked https://syzkaller.appspot.com/bug?extid=5f1acda7e06a2298fae6
down to the problem which may be illustrated by the following pseudocode:

    int sock;

    /* thread 1 */
    while (1) {
            struct msghdr msg = { ... };
            sock = socket(AF_SMC, SOCK_STREAM, 0);
            sendmsg(sock, &msg, MSG_FASTOPEN);
            close(sock);
    }

    /* thread 2 */
    while (1) {
            int on = 1;
            ioctl(sock, FIOASYNC, &on);
            on = 0;
            ioctl(sock, FIOASYNC, &on);
    }

That is, something in thread 1 may cause 'smc_switch_to_fallback()' and
swap kernel sockets (of 'struct smc_sock') behind 'sock' between 'ioctl()'
calls in thread 2, so this becomes an attempt to add fasync entry to one
socket but remove from another one. When 'sock' is closing, '__fput()'
calls 'f_op->fasync()' _before_ 'f_op->release()', and it's too late to
revert the trick performed by 'smc_switch_to_fallback()' in 'smc_release()'
and below. Finally we end up with leaked 'struct fasync_struct' object
linked to the base socket, and this object is noticed by '__sock_release()'
("fasync list not empty"). Of course using 'fasync_remove_entry()' in such
a way is extremely ugly, but what else can we do without touching generic
socket code, '__fput()', etc.? Comments are highly appreciated.
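
For illustration only, the "add to one socket, remove from another" outcome
described above can be reproduced in a small user-space model. Everything in
it (toy_sock, toy_file, toy_fasync, toy_leak_scenario) is hypothetical and is
not kernel code. The model also assumes that the entry stays on the base
socket when the file is retargeted, which may not match what
smc_switch_to_fallback() really does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are kernel types. */
struct toy_sock {
	int fasync_entries;		/* models the length of wq.fasync_list */
};

struct toy_file {
	struct toy_sock *private_data;	/* socket the file currently points to */
};

/* Models FIOASYNC: acts on whichever socket the file points to right now. */
static void toy_fasync(struct toy_file *f, bool on)
{
	struct toy_sock *s = f->private_data;

	if (on)
		s->fasync_entries++;
	else if (s->fasync_entries > 0)
		s->fasync_entries--;
}

/* Returns the number of entries left behind on the base socket. */
int toy_leak_scenario(void)
{
	struct toy_sock base = { 0 }, clc = { 0 };
	struct toy_file file = { &base };

	toy_fasync(&file, true);	/* thread 2: on = 1, entry lands on base */
	file.private_data = &clc;	/* thread 1: fallback retargets the file */
	toy_fasync(&file, false);	/* thread 2: on = 0, only clc is visited */
	toy_fasync(&file, false);	/* close: models fasync(-1, file, 0)     */

	return base.fasync_entries;
}
```

In this model the base socket still holds one entry at release time, which
corresponds to the "fasync list not empty" report.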
Hi, Dmitry. Just to confirm if I understand correctly:
1. on = 1; ioctl(sock, FIOASYNC, &on): a fasync entry is added to
   smc->sk.sk_socket->wq.fasync_list;

2. Then fallback happened and swapped the sockets:

       smc->clcsock->file = smc->sk.sk_socket->file;
       smc->clcsock->file->private_data = smc->clcsock;
       smc->clcsock->wq.fasync_list = smc->sk.sk_socket->wq.fasync_list;
       smc->sk.sk_socket->wq.fasync_list = NULL;

3. on = 0; ioctl(sock, FIOASYNC, &on): the fasync entry is removed
   from smc->clcsock->wq.fasync_list.
   (Is there a race between 2 and 3?)

4. Then the file is closed: __fput() calls file->f_op->fasync(-1, file, 0),
   then sock_fasync() calls fasync_helper(fd, filp, on, &wq->fasync_list)
   and fasync_remove_entry() removes entries in smc->clcsock->wq.fasync_list.
   Now smc->clcsock->wq.fasync_list is empty.

5. __fput() calls file->f_op->release(inode, file), then sock_close() calls
   __sock_release(), then ops->release calls smc_release(), and __smc_release()
   calls smc_restore_fallback_changes() to restore the socket:

       if (smc->clcsock->file) { /* non-accepted sockets have no file yet */
               smc->clcsock->file->private_data = smc->sk.sk_socket;
               smc->clcsock->file = NULL;
               smc_fback_restore_callbacks(smc);
       }

6. Then back in __sock_release(), we check whether sock->wq.fasync_list (that
   is, smc->sk.sk_socket->wq.fasync_list) is empty, and it is empty.
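
To make the sequence concrete, the six steps above can be replayed in a small
user-space model. This is only a sketch: model_sock, model_file, model_fasync
and model_sequential_close are hypothetical names, a counter stands in for
wq.fasync_list, and the steps are run strictly in order with no concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_file;

struct model_sock {
	int fasync_entries;		/* stands in for wq.fasync_list */
	struct model_file *file;
};

struct model_file {
	struct model_sock *private_data;
};

/* Models sock_fasync(): acts on the socket the file points to right now. */
static void model_fasync(struct model_file *f, bool on)
{
	struct model_sock *s = f->private_data;

	if (on)
		s->fasync_entries++;
	else if (s->fasync_entries > 0)
		s->fasync_entries--;
}

/*
 * Replays steps 1-6 in order. Returns the entries left on the base socket
 * and stores the entries left on the clcsock in *clc_left.
 */
int model_sequential_close(int *clc_left)
{
	struct model_file file;
	struct model_sock base = { 0, &file }, clc = { 0, NULL };

	file.private_data = &base;

	model_fasync(&file, true);		/* step 1: on = 1 */

	clc.file = base.file;			/* step 2: fallback swap */
	clc.file->private_data = &clc;
	clc.fasync_entries = base.fasync_entries;
	base.fasync_entries = 0;

	model_fasync(&file, false);		/* step 3: on = 0 */
	model_fasync(&file, false);		/* step 4: fasync(-1, file, 0) */

	if (clc.file) {				/* step 5: restore */
		clc.file->private_data = &base;
		clc.file = NULL;
	}

	*clc_left = clc.fasync_entries;		/* step 6: emptiness check */
	return base.fasync_entries;
}
```

With the steps serialized like this, both counters end at zero, so any leak
would have to come from an interleaving that the model does not cover.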
So in which step did we leak the fasync_struct entry in
smc->sk.sk_socket->wq.fasync_list? It looks like I missed something; could you
please point it out to me?
Thanks!
Signed-off-by: Dmitry Antipov <dmantipov@xxxxxxxxx>
---
net/smc/af_smc.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 0f53a5c6fd9d..68cde9db5d2f 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -337,9 +337,13 @@ static int smc_release(struct socket *sock)
 	else
 		lock_sock(sk);
-	if (old_state == SMC_INIT && sk->sk_state == SMC_ACTIVE &&
-	    !smc->use_fallback)
+	if (smc->use_fallback) {
+		/* FIXME: ugly and should be done in some other way */
+		if (sock->wq.fasync_list)
+			fasync_remove_entry(sock->file, &sock->wq.fasync_list);
+	} else if (old_state == SMC_INIT && sk->sk_state == SMC_ACTIVE) {
 		smc_close_active_abort(smc);
+	}
 	rc = __smc_release(smc);
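
As a rough illustration of what the added branch is meant to achieve, here is
a user-space sketch. The names sketch_sock, sketch_release_cleanup and
sketch_release_warns are hypothetical, a counter stands in for
sock->wq.fasync_list, and removing one entry stands in for the
fasync_remove_entry() call on sock->file.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_sock {
	int fasync_entries;	/* stands in for sock->wq.fasync_list */
	bool use_fallback;	/* stands in for smc->use_fallback */
};

/* Models the new branch: drop a stale entry only on the fallback path. */
static void sketch_release_cleanup(struct sketch_sock *sock)
{
	if (sock->use_fallback && sock->fasync_entries > 0)
		sock->fasync_entries--;
}

/* Returns true if the "fasync list not empty" check would still fire. */
bool sketch_release_warns(int stale_entries, bool use_fallback,
			  bool apply_patch)
{
	struct sketch_sock sock = { stale_entries, use_fallback };

	if (apply_patch)
		sketch_release_cleanup(&sock);

	return sock.fasync_entries != 0;
}
```

In the sketch, a single stale entry on a fallback socket no longer trips the
check once the cleanup runs, while the non-fallback path is left unchanged.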