Re: Direct I/O performance problems with 1GB pages

David Hildenbrand <david@xxxxxxxxxx> · Mon, 27 Jan 2025 17:09:57 +0100

If the workload is doing a lot of single-page try_grab_folio_fast(), could it
do so on a larger area (multiple pages at once -> single refcount update)?

Not really.  This is memory that's being used as the buffer cache, so
every thread in your database is hammering on it and pulling in exactly
the data that it needs for the SQL query that it's processing.

Ouch.

Is there a link to the report that you could share?  Thanks.

Andres shared some gists, but I don't want to send those to a
mailing list without permission.  Here's the kernel part of the
perf report:

     14.04%  postgres         [kernel.kallsyms]          [k] try_grab_folio_fast
             |
              --14.04%--try_grab_folio_fast
                        gup_fast_fallback
                        |
                         --13.85%--iov_iter_extract_pages
                                   bio_iov_iter_get_pages
                                   iomap_dio_bio_iter
                                   __iomap_dio_rw
                                   iomap_dio_rw
                                   xfs_file_dio_read
                                   xfs_file_read_iter
                                   __io_read
                                   io_read
                                   io_issue_sqe
                                   io_submit_sqes
                                   __do_sys_io_uring_enter
                                   do_syscall_64

Now, since postgres is using io_uring, perhaps there could be a path
which registers the memory with io_uring (doing the refcount/pincount
dance once), and then uses that pinned memory for each I/O.  Maybe that
already exists; I'm not keeping up with io_uring development and I can't
seem to find any documentation on what things like io_provide_buffers()
actually do.

That's precisely what io_uring fixed buffers do :)

--
Cheers,

David / dhildenb