Re: CLONE_IO documentation

Hi Jens,

Following up after a long time on this:

On Mon, Apr 14, 2008 at 12:13 PM, Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> On Mon, Apr 14 2008, Michael Kerrisk wrote:
>> Hi Jens,
>>
>> Could you supply some text describing CLONE_IO suitable for inclusion
>> in the clone.2 man page?
>> ( http://www.kernel.org/doc/man-pages/online/pages/man2/clone.2.html
>> ).  In that text it would be helpful to explain what an "I/O context"
>> is.
>
> Sure, I'll see if I can come up with something. Or perhaps you can help
> me a bit, being the writer ;-)
>
> If the CLONE_IO flag is set, the process will share the same io context.
> The I/O context is the I/O scope of the disk scheduler. So if you think
> of the I/O context as what the I/O scheduler uses to map to a process,
> when CLONE_IO is set multiple processes will map to the same I/O context
> and will be treated as one by the I/O scheduler. What this means is that
> they get to share disk time. For the anticipatory and CFQ scheduler, if
> process A and process B share I/O context, they will be allowed to
> interleave their disk access. So if you have several threads doing I/O
> on behalf of the same process (aio_read(), for instance), they should
> set CLONE_IO to get better I/O performance with CFQ and AS.
>
> A man page should not mention the specific schedulers, just mention that
> it'll improve the information available to the kernel and the
> performance of the app for the scenario described. In practice, it'll
> only really apply to CFQ and AS. For deadline and noop, there'll be
> essentially zero difference as they have no concept of I/O contexts.

I took your text as a base but reworked it somewhat, so *please check
the following carefully* and let me know if anything should be
changed or added:

       CLONE_IO (since Linux 2.6.25)
              If  CLONE_IO  is set, then the new process shares an I/O
              context with the calling process.  If this flag  is  not
              set,  then (as with fork(2)) the new process has its own
              I/O context.

              The I/O context is the I/O scope of the  disk  scheduler
              (i.e., what the I/O scheduler uses to model scheduling of
              a process's I/O).  If processes share the same I/O  con-
              text,  they are treated as one by the I/O scheduler.  As
              a consequence, they get to share disk  time.   For  some
              I/O  schedulers,  if two processes share an I/O context,
              they will be allowed to interleave  their  disk  access.
              If  several  threads are doing I/O on behalf of the same
              process (aio_read(3), for instance), they should  employ
              CLONE_IO to get better I/O performance.

              If  the  kernel  is not configured with the CONFIG_BLOCK
              option, this flag is a no-op.
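
As a rough illustration of the usage described above (a sketch only,
not part of the proposed page text): the helper names io_worker() and
run_io_worker_shared_ioc() are hypothetical, and the example assumes
the glibc clone() wrapper with _GNU_SOURCE defined.  CLONE_VM is left
out on purpose, so the child runs on a copy of the address space and
only the I/O context is shared with the caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (1024 * 1024)

/* Hypothetical worker: stands in for a helper that does disk I/O on
   behalf of the parent.  It reads the named file to the end and
   returns 0 on success, 1 if the file cannot be opened. */
static int io_worker(void *arg)
{
    const char *path = arg;
    char buf[4096];
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return 1;
    while (fread(buf, 1, sizeof(buf), f) > 0)
        ;                       /* discard the data; only the I/O matters */
    fclose(f);
    return 0;
}

/* Create a child that shares the caller's I/O context, wait for it,
   and return its exit status (or -1 if something failed). */
int run_io_worker_shared_ioc(const char *path)
{
    char *stack;
    pid_t pid;
    int status;

    stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Assumption: the stack grows downward, so pass the top of the
       buffer.  SIGCHLD lets the parent reap the child with waitpid(). */
    pid = clone(io_worker, stack + CHILD_STACK_SIZE,
                CLONE_IO | SIGCHLD, (void *) path);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    if (waitpid(pid, &status, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would invoke this as, say, run_io_worker_shared_ioc("/dev/null")
and check for a zero return.  Whether sharing the I/O context makes any
measurable difference depends on the I/O scheduler in use.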

The patch against clone.2 is below.

Thanks,

Michael


--- a/man2/clone.2
+++ b/man2/clone.2
@@ -224,6 +223,36 @@ Calls to
 .BR umask (2)
 performed later by one of the processes do not affect the other process.
 .TP
+.BR CLONE_IO " (since Linux 2.6.25)"
+If
+.B CLONE_IO
+is set, then the new process shares an I/O context with
+the calling process.
+If this flag is not set, then (as with
+.BR fork (2))
+the new process has its own I/O context.
+
+.\" The following based on text from Jens Axboe
+The I/O context is the I/O scope of the disk scheduler (i.e.,
+what the I/O scheduler uses to model scheduling of a process's I/O).
+If processes share the same I/O context,
+they are treated as one by the I/O scheduler.
+As a consequence, they get to share disk time.
+For some I/O schedulers,
+.\" the anticipatory and CFQ scheduler
+if two processes share an I/O context,
+they will be allowed to interleave their disk access.
+If several threads are doing I/O on behalf of the same process
+.RB ( aio_read (3),
+for instance), they should employ
+.B CLONE_IO
+to get better I/O performance.
+.\" with CFQ and AS.
+
+If the kernel is not configured with the
+.B CONFIG_BLOCK
+option, this flag is a no-op.
+.TP
 .BR CLONE_NEWIPC " (since Linux 2.6.19)"
 If
 .B CLONE_NEWIPC