Re: [GIT PULL] io_uring followup fixes for 5.12-rc4

Jens Axboe <axboe@xxxxxxxxx> · Sun, 21 Mar 2021 14:15:17 -0600

On 3/21/21 1:57 PM, Linus Torvalds wrote:
> On Sun, Mar 21, 2021 at 9:38 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> - Catch and loop when needing to run task_work before a PF_IO_WORKER
>>   thread goes to sleep.
> 
> Hmm. The patch looks fine, but it makes me wonder: why does that code
> use test_tsk_thread_flag() and clear_tsk_thread_flag() on current?
> 
> It should just use test_thread_flag() and clear_thread_flag().
> 
> Now it looks up "current" - which goes through the thread info - and
> then looks up the thread from that. It's all kinds of stupid.
> 
> It should just have used the thread_info from the beginning, which is
> what test_thread_flag() and clear_thread_flag() do.
> 
> I see the same broken pattern in both fs/io-wq.c (which is where I
> noticed it when looking at the patch) and in fs/io_uring.c.
> 
> Please don't do "*_tsk_thread_flag(current, x)", when just
> "*_thread_flag(x)" is simpler, and more efficient.
> 
> In fact, you should avoid *_tsk_thread_flag() as much as possible in general.
> 
> Thread flags should be considered mostly private to that thread - the
> exceptions are generally some very low-level system stuff, ie core
> signal handling and things like that.
> 
> So please change things like
> 
>         if (test_tsk_thread_flag(current, TIF_NOTIFY_SIGNAL))
> 
> to
> 
>         if (test_thread_flag(TIF_NOTIFY_SIGNAL))
> 
> etc.
> 
> And yes, we have a design mistake in a closely related area:
> "signal_pending()" should *not* take the task pointer either, and we
> should have the "current thread" separate from "another thread".
> 
> Maybe the "signal_pending(current)" makes people think it's a good
> idea to pass in "current" to the thread flag checkers. We would have
> been better off with "{fatal_,}signal_pending(void)" for the current
> task, and "tsk_{fatal_,}signal_pending(tsk)" for the (very few) cases
> of checking another task.
> 
> Because it really is all kinds of stupid (yes, often historical -
> going all the way back to when 'current' was the main model - but now
> stupid) to look up "current" to then look up thread data, when these
> days the basic pattern is
> 
>   #define current get_current()
>   #define get_current() (current_thread_info()->task)
> 
> ie, the *thread_info* is the primary and quick thing, and "current"
> is the indirection, and so if you see code that basically does
> "task_thread_info()" on "current", it is literally going back and
> forth between the two.
> 
> And yes, on architectures that use "THREAD_INFO_IN_TASK" (which does
> include x86), the back-and-forth ends up being a non-issue (because
> it's just offsets into containing structs) and it doesn't really
> matter. But conceptually, patterns like "test_tsk_thread_flag(current,
> x)" really are wrong, and on some architectures it generates
> potentially *much* worse code.
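
To make the back-and-forth concrete, here is a rough userspace model of
the two lookup paths on an arch without THREAD_INFO_IN_TASK. Everything
prefixed model_/MODEL_ is made up for illustration (including the bit
number), and the counter only tallies the extra dependent pointer loads.
It is not the kernel code and says nothing about the assembly any real
compiler generates.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these are the kernel definitions. */
#define MODEL_TIF_NOTIFY_SIGNAL 2	/* hypothetical bit number */

struct model_task;

struct model_thread_info {
	unsigned long flags;
	struct model_task *task;	/* thread_info -> task */
};

struct model_task {
	struct model_thread_info *thread_info;	/* task -> thread_info */
};

static struct model_thread_info model_ti;
static struct model_task model_tsk;
static unsigned int model_loads;	/* extra dependent pointer loads */

static void model_init(void)
{
	model_ti.flags = 0;
	model_ti.task = &model_tsk;
	model_tsk.thread_info = &model_ti;
	model_loads = 0;
}

/* Stands in for current_thread_info(): the primary, cheap lookup. */
static struct model_thread_info *model_current_thread_info(void)
{
	return &model_ti;
}

/* Stands in for "current": derived from the thread_info. */
static struct model_task *model_get_current(void)
{
	model_loads++;
	return model_current_thread_info()->task;
}

/* Direct path: goes straight to the current thread_info. */
static bool model_test_thread_flag(int flag)
{
	return (model_current_thread_info()->flags & (1UL << flag)) != 0;
}

/* Round trip when handed "current": task pointer back to thread_info. */
static bool model_test_tsk_thread_flag(struct model_task *tsk, int flag)
{
	model_loads++;
	return (tsk->thread_info->flags & (1UL << flag)) != 0;
}
```

Both helpers return the same answer for the current thread. In this
model, model_test_tsk_thread_flag(model_get_current(), x) just pays for
two extra dependent loads to get there.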

Thanks, that's useful information. I guess it just ended up being used
by chance, and I didn't realize it made a difference for some archs. I'll
change these, and I also think that io-wq should be a bit nicer and use
tracehook_notify_signal() if TIF_NOTIFY_SIGNAL is set. Doesn't matter
now, but very well might in the future when TIF_NOTIFY_SIGNAL gets
used for more than just task_work notifications.
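
Roughly what I have in mind for the io-wq side, as a userspace model
rather than a patch. The model_/MODEL_ names are invented for the
sketch, and it assumes the helper clears the flag and then runs whatever
task_work is queued. Treat that as my reading of
tracehook_notify_signal(), not a spec.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these are the kernel definitions. */
#define MODEL_TIF_NOTIFY_SIGNAL 2	/* hypothetical bit number */

static unsigned long model_flags;	/* current thread's flag word */
static int model_pending_work;		/* queued task_work items */
static int model_work_done;		/* task_work items already run */

static bool model_test_thread_flag(int flag)
{
	return (model_flags & (1UL << flag)) != 0;
}

static void model_clear_thread_flag(int flag)
{
	model_flags &= ~(1UL << flag);
}

/*
 * Models the notify helper (assumed behavior): clear the notification
 * first, then run everything that was queued.
 */
static void model_notify_signal(void)
{
	model_clear_thread_flag(MODEL_TIF_NOTIFY_SIGNAL);
	while (model_pending_work > 0) {
		model_pending_work--;
		model_work_done++;
	}
}

/*
 * Worker-side check before going to sleep. Returns true if the worker
 * handled a notification and should loop instead of sleeping.
 */
static bool model_worker_should_loop(void)
{
	if (!model_test_thread_flag(MODEL_TIF_NOTIFY_SIGNAL))
		return false;
	model_notify_signal();
	return true;
}
```

The point of going through one helper is that the worker doesn't need
to know what the flag is used for. If it later covers more than
task_work, only the helper changes.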

-- 
Jens Axboe