Re: [LSF/MM TOPIC] really non-blocking in aio stack

(dropping lsf-pc from the follow-on discussion for fsdevel)

> Although we have defined EIOCBRETRY, it seems to me that it is not used
> as well as it could be.

EIOCBRETRY is a disaster because the operations are retried in the
context of the kaio threads.  To use it safely you have to ensure that
nothing the operation does after returning -EIOCBRETRY references
current->.

Realize that this can include convoluted paths through shared code that
might have *no idea* that they're used by some other path after
EIOCBRETRY and so have to be supernaturally careful with current->
references.  It's a maintenance nightmare.
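To make the hazard concrete, here is a minimal userspace sketch of it. This is not kernel code: struct fake_task, current_task, shared_helper_journal() and retry_sees_submitter_state() are hypothetical names invented for illustration, and the "context switch" is modelled by reassigning a pointer. The helper stands in for shared code that reads per-task state without knowing it might be re-entered from a retry thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task state reached through current->. */
struct fake_task {
    const char *comm;     /* task name, for readability only */
    void *journal_info;   /* models current->journal_info */
};

/* Models the kernel's `current`: whichever task is running right now. */
static struct fake_task *current_task;

/* Shared code that has no idea it can be reached again after a retry. */
static void *shared_helper_journal(void)
{
    return current_task->journal_info;
}

/*
 * Returns 1 if the retried call saw the submitter's journal_info,
 * 0 if it saw something else.
 */
static int retry_sees_submitter_state(void)
{
    int handle = 0;
    struct fake_task submitter = { "submitter", &handle };
    struct fake_task kaio = { "kaio-worker", NULL };

    /* First attempt runs in the submitter's context. */
    current_task = &submitter;
    void *first = shared_helper_journal();

    /* The operation "returned -EIOCBRETRY"; the retry runs as the worker. */
    current_task = &kaio;
    void *retried = shared_helper_journal();

    return first == &handle && retried == first;
}
```

In this sketch retry_sees_submitter_state() returns 0: the same helper, called for the same logical operation, silently picks up the worker's state on the retry.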

The fs/aio.c retry code has the aio thread magically assume the mm
context of the submitting thread when it calls the retry handlers
(aio_kick_handler()).  So, great, that's one current field that happens
to be sharable.  How about others?  current->journal_info?
current->io_context?  People sometimes ask about EIOCBRETRY and vfs ops
and never mention current->link_count.

As one of the people who has sunk serious time into fs/aio.c (cc:ing my
erstwhile partner in crime), I strongly discourage investing more
resources into the fs/aio.c design.  If it were me I'd be putting
resources into async infrastructure which makes use of the existing
sync system call handling paths.

Async calls should have no idea that they're async: no duplication of
the syscall abi in submission argument structs, no magical fget before
calling operation handlers, no iocbs being sprinkled down through kernel
call stacks, no magical return codes.

Yeah, this ends up implying heavy use of kernel threads and playing
scary games with the task_struct of the submitter and async processing
thread.  At least the scary code would be in one place.
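As a rough userspace analogy of that shape (a sketch under stated assumptions, not a proposal for the kernel interface): a helper thread runs the ordinary synchronous pread(), and the caller collects pread()'s own return value later. struct async_pread and the async_pread_*() functions are hypothetical names made up for this example. The request record carries only pread()'s arguments, and the call itself is unmodified.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request record: just the arguments of an ordinary pread(). */
struct async_pread {
    int fd;
    void *buf;
    size_t count;
    off_t offset;
    ssize_t result;      /* filled in by the helper thread */
    pthread_t thread;
};

/* The helper runs the unmodified sync call; pread() cannot tell the difference. */
static void *async_pread_worker(void *arg)
{
    struct async_pread *req = arg;
    req->result = pread(req->fd, req->buf, req->count, req->offset);
    return NULL;
}

/* Submission: returns 0 on success or a pthread error number. */
static int async_pread_submit(struct async_pread *req)
{
    return pthread_create(&req->thread, NULL, async_pread_worker, req);
}

/* Completion: wait for the helper and hand back pread()'s own return value. */
static ssize_t async_pread_wait(struct async_pread *req)
{
    pthread_join(req->thread, NULL);
    return req->result;
}

/* Self-check: write five bytes to a temp file, read them back asynchronously. */
static int async_pread_demo(void)
{
    char buf[6] = { 0 };
    FILE *f = tmpfile();

    if (!f)
        return -1;
    if (fwrite("hello", 1, 5, f) != 5 || fflush(f) != 0) {
        fclose(f);
        return -1;
    }

    struct async_pread req = { .fd = fileno(f), .buf = buf, .count = 5, .offset = 0 };
    if (async_pread_submit(&req) != 0) {
        fclose(f);
        return -1;
    }

    ssize_t n = async_pread_wait(&req);
    fclose(f);
    return (n == 5 && strcmp(buf, "hello") == 0) ? 0 : -1;
}
```

Userspace threads already share the file table and address space, so this sketch gets for free exactly the part that is scary in the kernel: making the processing thread look enough like the submitter's task_struct.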

The current alternative of requiring fragile async implementations of
system calls has a compelling history of failure. fs/aio.c has been
around for a decade and has not seen significant use outside of its
initial supported operation.

I should really get the ogg of my LCA presentation (more of a jet-lagged
rant :)) on this posted somewhere.

- z