Re: [GIT PULL] io_uring updates for 5.14-rc1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 1, 2021 at 3:06 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Jun 29, 2021 at 1:43 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
> >
> > - Support for mkdirat, symlinkat, and linkat for io_uring (Dmitry)
>
> I pulled this, and then I unpulled it again.

First of all, I totally agree that parts of it are ugly. And I'm happy
to work on improving it with some guidance. But it's been sort of hard
getting feedback on the namei part of things (Christian Brauner was very
supportive though), and in cases where I was not sure what's the best
way to do something I just went with the "pick the least ugly way I can
see at the moment and collect the feedback" approach. And yes, in some
cases "least ugly" is still very ugly.

Part of the issue is I'm very new to the kernel code, I've seen the
unlinkat / renameat changes and thought that surely I can do the same
for mkdirat. It turned out to be much larger and not just about mkdirat
in the end.

> I hate how makes the rules for when "putname()" is called completely
> arbitrary and very confusing.

In my opinion, this is because of the "consume the name on error, but
not on success" logic in __filename_create / __filename_lookup. This
behavior was in fact suggested by Al back in January:
https://lore.kernel.org/io-uring/20210201150042.GQ740243@zeniv-ca/

And it works reasonably well when there is just one struct filename
passed in. But when there are two the state potentially becomes very
confusing: we might end up with cases like the second filename was
freed, but the first was not (which is basically where the hacks below
live).

> It ends up with multiple cases of something like
>
>         error = -ENOENT;
>         goto out_putnames;
>
> that didn't exist before.

Before it was just "return -ENOENT". But now there are two filenames
passed in that the function is responsible for freeing. I'm not sure how
this can be avoided.

> And worse still ends up being that unbelievably ugly hack with
>
>         // On error `new` is freed by __filename_create, prevent extra freeing
>         // below
>         new = ERR_PTR(error);
>         goto out_putpath;
>
> that ends up intentionally undoing one of the putnames because the
> name has already been used.

Yes, this is ugly, and teaching putname to deal with NULLs would mean we
could just set it to NULL here, but we can't just set it to NULL when it
was used, see below.

> And none of the commits have acks by Al. I realize that he can
> sometimes be a bit unresponsive, but this is just *UGLY*. And we've
> had too many io_uring issues for me to just say "I'm sure it's fine".
>
> I can see a few ways to at least de-uglify things:
>
>  - Maybe we can make putname() just do nothing for IS_ERR_OR_NULL() names.
>
>    We have that kind of rules for a number of path walking things,
> where passing in an error pointer is fine. Things like
> link_path_walk() or filename_lookup() act that way very much by
> design, exactly to make it easy to handle error conditions.

This sounds great to me, will make some paths much cleaner. But will
help with "new = ERR_PTR(error);" only partially (by using NULL instead
of ERR_PTR(error)).

>  - callers of __filename_create() and similar thar eat the name (and
> return a dentry or whatever) could then set the name to NULL, not as
> part of the error handling, but unconditionally as a "it's been used".

The problem is we have to keep the filenames around for retries on
ESTALE. It's not consumed by __filename_create() on success. So it's not
as simple as setting the name to NULL after calling __filename_create().
If it was not for ESTALE it'd be possible just to use filename_create()
that consumes the name passed to it unconditionally and it'd make the
logic much simpler indeed.

I'll do my best to improve things, but if there are any suggestions
they'd be appreciated.

-- 
Dmitry Kadashev



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux