On 2/14/22 7:59 PM, Hao Xu wrote:
> On 2/15/22 1:43 AM, Stefan Roesch wrote:
>> This patch series adds support for async buffered writes. Currently
>> io-uring only supports buffered writes in the slow path, by processing
>> them in the io workers. With this patch series it is now possible to
>> support buffered writes in the fast path. To be able to use the fast
>> path, the required pages must be in the page cache or loadable with
>> noio. Otherwise they still get punted to the slow path.
>>
>> If a buffered write request requires more than one page, it is possible
>> that only part of the request can use the fast path; the rest will be
>> completed by the io workers.
>>
>> Support for async buffered writes:
>>
>> Patch 1: fs: Add flags parameter to __block_write_begin_int
>>   Add a flags parameter to the function __block_write_begin_int
>>   to allow specifying a nowait parameter.
>>
>> Patch 2: mm: Introduce do_generic_perform_write
>>   Introduce a new do_generic_perform_write function, split off from
>>   the existing generic_perform_write() function. It takes an
>>   additional flags parameter, which is used to specify the nowait
>>   flag.
>>
>> Patch 3: mm: add noio support in filemap_get_pages
>>   This allows allocating pages with noio if a page for async
>>   buffered writes is not yet loaded in the page cache.
>>
>> Patch 4: mm: Add support for async buffered writes
>>   For async buffered writes, allocate pages without blocking on the
>>   allocation.
>>
>> Patch 5: fs: split off __alloc_page_buffers function
>>   Split off an __alloc_page_buffers() function with a new gfp_t
>>   parameter.
>>
>> Patch 6: fs: split off __create_empty_buffers function
>>   Split off a __create_empty_buffers() function with a new gfp_t
>>   parameter.
>>
>> Patch 7: fs: Add aop_flags parameter to create_page_buffers()
>>   Add aop_flags to the create_page_buffers() function. Use atomic
>>   allocation for async buffered writes.
>> Patch 8: fs: add support for async buffered writes
>>   Return -EAGAIN instead of -ENOMEM for async buffered writes. This
>>   causes the write request to be processed by an io worker.
>>
>> Patch 9: io_uring: add support for async buffered writes
>>   This enables async buffered writes for block devices in io_uring.
>>   Buffered writes are enabled for blocks that are already in the page
>>   cache or can be acquired with noio.
>>
>> Patch 10: io_uring: Add tracepoint for short writes
>>
>> Support for write throttling of async buffered writes:
>>
>> Patch 11: sched: add new fields to task_struct
>>   Add two new fields to the task_struct. These fields store the
>>   deadline after which writes are no longer throttled.
>>
>> Patch 12: mm: support write throttling for async buffered writes
>>   This changes the balance_dirty_pages function to take an additional
>>   parameter. When nowait is specified, the write throttling code no
>>   longer waits synchronously for the deadline to expire. Instead it
>>   sets the fields in task_struct. Once the deadline expires, the
>>   fields are reset.
>>
>> Patch 13: io_uring: support write throttling for async buffered writes
>>   Adds support to io_uring for write throttling. When writes are
>>   throttled, the write requests are added to the pending io list.
>>   Once the write throttling deadline expires, the writes are
>>   submitted.
>>
>> Enable async buffered write support:
>>
>> Patch 14: fs: add flag to support async buffered writes
>>   This sets the flag that enables async buffered writes for block
>>   devices.
>>
>> Testing:
>>   This patch series has been tested with xfstests and fio.
>>
>> Performance results:
>> For fio the following results have been obtained with a queue depth
>> of 1 and 4k block size (runtime 600 secs):
>>
>> sequential writes:
>>                  without patch    with patch
>> throughput:      329 MiB/s        1032 MiB/s
>> iops:            82k              264k
>> slat (nsec):     2332             3340
>> clat (nsec):     9017             60
>> CPU util%:       37%              78%
>>
>> random writes:
>>                  without patch    with patch
>> throughput:      307 MiB/s        909 MiB/s
>> iops:            76k              227k
>> slat (nsec):     2419             3780
>> clat (nsec):     9934             59
>> CPU util%:       57%              88%
>>
>> For an io depth of 1, the new patch improves throughput by close to
>> three times, and latency is also considerably reduced. To achieve the
>> same or better performance with the existing code, an io depth of 4
>> is required.
>>
>> Especially for mixed workloads this is a considerable improvement.
>>
>> Stefan Roesch (14):
>>   fs: Add flags parameter to __block_write_begin_int
>>   mm: Introduce do_generic_perform_write
>>   mm: add noio support in filemap_get_pages
>>   mm: Add support for async buffered writes
>>   fs: split off __alloc_page_buffers function
>>   fs: split off __create_empty_buffers function
>>   fs: Add aop_flags parameter to create_page_buffers()
>>   fs: add support for async buffered writes
>>   io_uring: add support for async buffered writes
>>   io_uring: Add tracepoint for short writes
>>   sched: add new fields to task_struct
>>   mm: support write throttling for async buffered writes
>>   io_uring: support write throttling for async buffered writes
>>   block: enable async buffered writes for block devices.
>>
>>  block/fops.c                    |   5 +-
>>  fs/buffer.c                     | 103 ++++++++++++++++---------
>>  fs/internal.h                   |   3 +-
>>  fs/io_uring.c                   | 130 +++++++++++++++++++++++++++++---
>>  fs/iomap/buffered-io.c          |   4 +-
>>  fs/read_write.c                 |   3 +-
>>  include/linux/fs.h              |   4 +
>>  include/linux/sched.h           |   3 +
>>  include/linux/writeback.h       |   1 +
>>  include/trace/events/io_uring.h |  25 ++++++
>>  kernel/fork.c                   |   1 +
>>  mm/filemap.c                    |  34 +++++++-
>>  mm/folio-compat.c               |   4 +
>>  mm/page-writeback.c             |  54 +++++++++----
>>  14 files changed, 298 insertions(+), 76 deletions(-)
>>
>>
>> base-commit: f1baf68e1383f6ed93eb9cff2866d46562607a43
>>
> Buffered reads and buffered writes differ a little: for the latter,
> specific filesystems may have blocking points due to journal
> operations.
>
This patch series only adds support for async buffered writes for block
devices, not filesystems.