On 11/14/23 11:10 PM, Xiaobing Li wrote:
> On 11/15/23 2:36 AM, Jens Axboe wrote:
>>      if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
>>              struct io_sq_data *sq = ctx->sq_data;
>>
>> -            if (mutex_trylock(&sq->lock)) {
>> -                    if (sq->thread) {
>> -                            sq_pid = task_pid_nr(sq->thread);
>> -                            sq_cpu = task_cpu(sq->thread);
>> -                    }
>> -                    mutex_unlock(&sq->lock);
>> -            }
>> +            sq_pid = sq->task_pid;
>> +            sq_cpu = sq->sq_cpu;
>>      }
>
> There are two problems:
> 1. The output of SqThread is inaccurate. What is actually recorded is
> the PID of the parent process.

Doh yes, we need to reset this at the start of the thread, post
assigning task_comm. I'll send out a v4 today.

> 2. Sometimes it can output, sometimes it outputs -1.
>
> The test results are as follows:
> Every 0.5s: cat /proc/9572/fdinfo/6 | grep Sq
> SqMask: 0x3
> SqHead: 6765744
> SqTail: 6765744
> CachedSqHead: 6765744
> SqThread: -1
> SqThreadCpu: -1
> SqBusy: 0%
> -------------------------------------------
> Every 0.5s: cat /proc/9572/fdinfo/6 | grep Sq
> SqMask: 0x3
> SqHead: 7348727
> SqTail: 7348728
> CachedSqHead: 7348728
> SqThread: 9571
> SqThreadCpu: 174
> SqBusy: 95%

Right, this is due to the uring_lock. We got rid of the main
regression, which was the new trylock for the sqd->lock, but the old
one remains. We can fix this as well for sqpoll info, but it's not a
regression from past releases, it's always been like that.

Pavel and I discussed it yesterday, and the easy solution is to make
io_sq_data be under RCU protection. But that requires this patch
first, so we don't have to fiddle with the sqpoll task itself. I can
try and hack up the patch if you want to test it, it'd be on top of
this one and for the next kernel release rather than 6.7.

-- 
Jens Axboe
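
For readers following the thread, the intermittent -1 can be illustrated with a small user-space sketch. This is not kernel code: `struct sq_data_sketch`, `sq_data_init`, `sq_thread_start`, `read_with_trylock` and `read_cached` are hypothetical names, and a pthread mutex stands in for whichever kernel lock the fdinfo reader only trylocks. It assumes, as the diff suggests, that the poll thread publishes its own PID and CPU into cached fields when it starts. The sketch is single-threaded on purpose; a real implementation would still need a safe way to read the cached fields concurrently, which is what the RCU idea in the thread is about.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's io_sq_data. */
struct sq_data_sketch {
    pthread_mutex_t lock; /* stands in for a lock the reader only trylocks */
    int thread_alive;     /* stands in for sq->thread != NULL */
    int live_pid;         /* what task_pid_nr(sq->thread) would report */
    int live_cpu;         /* what task_cpu(sq->thread) would report */
    int task_pid;         /* cached copy, named after the field in the diff */
    int sq_cpu;           /* cached copy, named after the field in the diff */
};

struct sq_info {
    int pid;
    int cpu;
};

static void sq_data_init(struct sq_data_sketch *sq)
{
    assert(sq != NULL);
    pthread_mutex_init(&sq->lock, NULL);
    sq->thread_alive = 0;
    sq->live_pid = sq->live_cpu = -1;
    sq->task_pid = sq->sq_cpu = -1;
}

/* Called from the poll thread itself, so the cached PID is the thread's
 * own and not the PID of the process that created it. */
static void sq_thread_start(struct sq_data_sketch *sq, int pid, int cpu)
{
    assert(sq != NULL);
    sq->thread_alive = 1;
    sq->live_pid = sq->task_pid = pid;
    sq->live_cpu = sq->sq_cpu = cpu;
}

/* Trylock scheme: if the lock is busy, give up and report -1. */
static struct sq_info read_with_trylock(struct sq_data_sketch *sq)
{
    struct sq_info info = { -1, -1 };

    if (pthread_mutex_trylock(&sq->lock) == 0) {
        if (sq->thread_alive) {
            info.pid = sq->live_pid;
            info.cpu = sq->live_cpu;
        }
        pthread_mutex_unlock(&sq->lock);
    }
    return info;
}

/* Cached scheme: read the published fields without taking the lock. */
static struct sq_info read_cached(const struct sq_data_sketch *sq)
{
    struct sq_info info = { sq->task_pid, sq->sq_cpu };
    return info;
}
```

With the lock free, both readers return the published values. While the lock is held, `read_with_trylock` falls back to -1 and `read_cached` still returns the published values, which matches the flip-flopping seen in the `fdinfo` output above.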