Re: [GIT PULL] io_uring updates for 5.18-rc1

On Wed, Jun 1, 2022 at 11:34 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> But as a first step, let's just mark it deprecated with a pr_warn() for
> 5.20 and then plan to kill it off whenever a suitable amount of releases
> have passed since that addition.

I'd love to, but it's not actually realistic as things stand now.
epoll() is used in a *lot* of random libraries. A "pr_warn()" would
just be senseless noise, I bet.

No, there's a reason that EPOLL is still there, still 'default y',
even though I dislike it and think it was a mistake, and we've had
several nasty bugs related to it over the years.

It really can be a very useful system call, it's just that it really
doesn't work the way the actual ->poll() interface was designed, and
it kind of hijacks it in ways that mostly work, but that have subtle
lifetime issues that you don't see with a regular select/poll because
those will always tear down the wait queues.
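
To make the difference concrete, here is a minimal userspace sketch (not
kernel code, and not taken from any real project). It assumes a Linux
system; the helper names readable_with_poll(), watch_with_epoll() and
readable_with_epoll() are made up for illustration. With poll(), interest
in the fd exists only for the duration of each call. With epoll, the fd is
registered once and the registration outlives any individual wait, which
is where the extra lifetime requirements on the kernel side come from.

```c
#include <poll.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is readable now, 0 if not, -1 on
 * error. Interest in fd is set up and torn down inside this one call. */
int readable_with_poll(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n = poll(&pfd, 1, 0);	/* timeout 0: do not block */

	if (n < 0)
		return -1;
	return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Hypothetical helper: registers fd once with a new epoll instance and
 * returns the epoll fd (or -1). The registration persists until it is
 * removed with EPOLL_CTL_DEL or the file goes away. */
int watch_with_epoll(int fd)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
	int epfd = epoll_create1(0);

	if (epfd < 0)
		return -1;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(epfd);
		return -1;
	}
	return epfd;
}

/* Hypothetical helper: checks the persistent registration without
 * re-adding the fd. Returns 1 if readable, 0 if not, -1 on error. */
int readable_with_epoll(int epfd)
{
	struct epoll_event ev;
	int n = epoll_wait(epfd, &ev, 1, 0);	/* timeout 0: do not block */

	if (n < 0)
		return -1;
	return (n == 1 && (ev.events & EPOLLIN)) ? 1 : 0;
}
```

A caller would create a pipe, call watch_with_epoll() on the read end
once, and then call readable_with_epoll() as often as it likes, whereas
readable_with_poll() has to be handed the fd again on every call.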

Realistically, the proper fix to epoll is likely to make it explicit,
and make files and drivers that want to support it have to actually
opt in. Because a lot of the problems have been due to epoll() looking
*exactly* like a regular poll/select to a driver or a filesystem, but
having those very subtle extended requirements.

(And no, the extended requirements aren't generally onerous, and
regular ->poll() works fine for 99% of all cases. It's just that
occasionally, special users are then fooled about special contexts).

In other words, it's a bit like our bad old days when "splice()" ended
up falling back to regular ->read()/->write() implementations with
set_fs(KERNEL_DS). Yes, that worked fine for 99% of all cases, and we
did it for years, but it also caused several really nasty issues for
when the read/write actor did something slightly unusual.

So I may dislike epoll quite intensely, but I don't think we can
*really* get rid of it. But we might be able to make it a bit more
controlled.

But so far every time it has caused issues, we've worked around it by
fixing it up in the particular driver or whatever that ended up being
triggered by epoll semantics.

                Linus