On 15/12/2020 01:33, Dave Chinner wrote:
> On Tue, Dec 15, 2020 at 01:03:45AM +0000, Pavel Begunkov wrote:
>> On 15/12/2020 00:56, Dave Chinner wrote:
>>> On Tue, Dec 15, 2020 at 12:20:23AM +0000, Pavel Begunkov wrote:
>>>> As reported, we must not do pressure stall information accounting for
>>>> direct IO, because otherwise it tells that it's thrashing a page when
>>>> actually doing IO on hot data.
>>>>
>>>> Apparently, bio_iov_iter_get_pages() is used only by paths doing direct
>>>> IO, so just make it avoid setting BIO_WORKINGSET, it also saves us CPU
>>>> cycles on doing that. For fs/direct-io.c just clear the flag before
>>>> submit_bio(), it's not of much concern performance-wise.
>>>>
>>>> Reported-by: Christoph Hellwig <hch@xxxxxxxxxxxxx>
>>>> Suggested-by: Christoph Hellwig <hch@xxxxxxxxxxxxx>
>>>> Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
>>>> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
>>>> ---
>>>>  block/bio.c    | 25 ++++++++++++++++---------
>>>>  fs/direct-io.c |  2 ++
>>>>  2 files changed, 18 insertions(+), 9 deletions(-)
>>> .....
>>>> @@ -1099,6 +1103,9 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
>>>>   * fit into the bio, or are requested in @iter, whatever is smaller. If
>>>>   * MM encounters an error pinning the requested pages, it stops. Error
>>>>   * is returned only if 0 pages could be pinned.
>>>> + *
>>>> + * It also doesn't set BIO_WORKINGSET, so is intended for direct IO. If used
>>>> + * otherwise the caller is responsible to do that to keep PSI happy.
>>>>   */
>>>>  int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>>>>  {
>>>> diff --git a/fs/direct-io.c b/fs/direct-io.c
>>>> index d53fa92a1ab6..914a7f600ecd 100644
>>>> --- a/fs/direct-io.c
>>>> +++ b/fs/direct-io.c
>>>> @@ -426,6 +426,8 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
>>>>  	unsigned long flags;
>>>>
>>>>  	bio->bi_private = dio;
>>>> +	/* PSI is only for paging IO */
>>>> +	bio_clear_flag(bio, BIO_WORKINGSET);
>>>
>>> Why only do this for the old direct IO path? Why isn't this
>>> necessary for the iomap DIO path?
>>
>> It's in the description. In short, block and iomap dio use
>> bio_iov_iter_get_pages(), which with this patch doesn't use
>> [__]bio_add_page() and so doesn't set the flag.
>
> That is not obvious to someone not intimately familiar with the
> patchset you are working on. You described -what- the code is doing,
> not -why- the flag needs to be cleared here.

It's missing the link between BIO_WORKINGSET and PSI, but otherwise
it describes both, what it does and how. I'll reword it for you next
iteration.

>
> "Direct IO does not operate on the current working set of pages
> managed by the kernel, so it should not be accounted as IO to the
> pressure stall tracking infrastructure. Only direct IO paths use
> bio_iov_iter_get_pages() to build bios, so to avoid PSI tracking of
> direct IO don't flag the bio with BIO_WORKINGSET in this function.
>
> fs/direct-io.c uses <some other function> to build the bio we
> are going to submit and so still flags the bio with BIO_WORKINGSET.
> Rather than convert it to use bio_iov_iter_get_pages() to avoid
> flagging the bio, we simply clear the BIO_WORKINGSET flag before
> submitting the bio."
>
> Cheers,
>
> Dave.
>

-- 
Pavel Begunkov
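
For readers following the thread without the block layer in their head, the link between BIO_WORKINGSET and PSI that the discussion asks for can be sketched in a small userspace model. This is not kernel code: the struct layout, the bit number, the psi_stall_events counter and every model_* function are hypothetical stand-ins, assuming only what the thread states, namely that a bio flagged with BIO_WORKINGSET is accounted by PSI at submission and that the old direct IO path clears the flag just before submit_bio().

```c
#include <stdbool.h>

/* Userspace model only. Names mirror the kernel's, but the bit number,
 * struct layout and counter below are assumptions for illustration. */
enum { BIO_WORKINGSET = 0 };            /* hypothetical bit position */

struct bio {
	unsigned int bi_flags;          /* the model keeps only the flag word */
};

static void bio_set_flag(struct bio *bio, unsigned int bit)
{
	bio->bi_flags |= 1U << bit;
}

static void bio_clear_flag(struct bio *bio, unsigned int bit)
{
	bio->bi_flags &= ~(1U << bit);
}

static bool bio_flagged(const struct bio *bio, unsigned int bit)
{
	return (bio->bi_flags & (1U << bit)) != 0;
}

/* Hypothetical stand-in for PSI memory stall accounting. */
static unsigned long psi_stall_events;

/* Models a page-adding helper that marks the bio when the page being
 * added belongs to the working set. */
static void model_add_workingset_page(struct bio *bio)
{
	bio_set_flag(bio, BIO_WORKINGSET);
}

/* Models submission: only flagged bios are accounted as stalls. */
static void model_submit_bio(struct bio *bio)
{
	if (bio_flagged(bio, BIO_WORKINGSET))
		psi_stall_events++;
}

/* Models the fs/direct-io.c change: clear the flag, then submit, so
 * direct IO on hot data is never reported as thrashing. */
static void model_dio_bio_submit(struct bio *bio)
{
	bio_clear_flag(bio, BIO_WORKINGSET);
	model_submit_bio(bio);
}
```

In this model a bio that went through model_add_workingset_page() and then model_submit_bio() bumps the counter, while the same bio sent through model_dio_bio_submit() does not. That is the behaviour the patch aims for on the old direct IO path; the bio_iov_iter_get_pages() users get the same result by never setting the flag in the first place.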