On Fri, Feb 17, 2023 at 6:56 AM Daire Byrne <daire@xxxxxxxx> wrote: > > Hi, > > Maybe someone here can quickly point me in the right direction for > this oddity that we noticed. > > On newer kernels, it looks like the task io accounting is not > incrementing the read_bytes when reading from a NFS mount? This was > definitely working on v5.16 downwards, but has not been working since > v5.18 up to v6.2 (I haven't tested v5.17 yet). > > If I read from a local filesystem, then the read_bytes for that PID is > incremented as expected. > > If I read over NFS using directIO, then the read_bytes is also > correctly incremented for that PID. It's just when reading normally > without directIO that it is not. > > The write_bytes and rchar are also still both correct in all situations. > > I have checked the kernel config and I'm fairly sure I have all the > right things enabled: > > CONFIG_TASK_XACCT=y > CONFIG_TASK_IO_ACCOUNTING=y > CONFIG_TASKSTATS=y > > Unless there was some extra config introduced specific to the nfs > client in later kernels that I missed? > > Cheers, > > Daire > Daire, Thanks for the report. Willy, Question for you at the bottom of this. First, here's what looks to be the candidate changes between these versions: $ git log --oneline v5.16..v5.18 fs/nfs/read.c 89c2be8a9516 NFS: discard NFS_RPC_SWAPFLAGS and RPC_TASK_ROOTCREDS fc1c5abfca7e NFS: Rename fscache read and write pages functions 8786fde8421c Convert NFS from readpages to readahead 16f2f4e679cf nfs: Implement cache I/O by accessing the cache directly I would be this is due to this patch, which went into 5.18: 8786fde8421c Convert NFS from readpages to readahead And the hunks that now call readahead_page vs read_cache_pages: @@ -397,14 +396,16 @@ int nfs_readpage(struct file *file, struct page *page) return ret; } -int nfs_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +void nfs_readahead(struct readahead_control *ractl) { + unsigned int nr_pages = readahead_count(ractl); + struct file *file = ractl->file; struct nfs_readdesc desc; - struct inode *inode = mapping->host; + struct inode *inode = ractl->mapping->host; + struct page *page; int ret; - trace_nfs_aop_readahead(inode, lru_to_page(pages), nr_pages); + trace_nfs_aop_readahead(inode, readahead_pos(ractl), nr_pages); nfs_inc_stats(inode, NFSIOS_VFSREADPAGES); ret = -ESTALE; @@ -422,14 +423,18 @@ int nfs_readpages(struct file *file, struct address_space *mapping, nfs_pageio_init_read(&desc.pgio, inode, false, &nfs_async_read_completion_ops); - ret = read_cache_pages(mapping, pages, readpage_async_filler, &desc); + while ((page = readahead_page(ractl)) != NULL) { + ret = readpage_async_filler(&desc, page); + put_page(page); + if (ret) + break; + } nfs_pageio_complete_read(&desc.pgio); put_nfs_open_context(desc.ctx); out: trace_nfs_aop_readahead_done(inode, nr_pages, ret); - return ret; } and this hunk: @@ -397,14 +396,16 @@ int nfs_readpage(struct file *file, struct page *page) return ret; } -int nfs_readpages(struct file *file, struct address_space *mapping, - struct list_head *pages, unsigned nr_pages) +void nfs_readahead(struct readahead_control *ractl) { + unsigned int nr_pages = readahead_count(ractl); + struct file *file = ractl->file; struct nfs_readdesc desc; - struct inode *inode = mapping->host; + struct inode *inode = ractl->mapping->host; + struct page *page; int ret; - trace_nfs_aop_readahead(inode, lru_to_page(pages), nr_pages); + trace_nfs_aop_readahead(inode, readahead_pos(ractl), nr_pages); nfs_inc_stats(inode, NFSIOS_VFSREADPAGES); ret = -ESTALE; @@ -422,14 +423,18 @@ int nfs_readpages(struct file *file, struct address_space *mapping, nfs_pageio_init_read(&desc.pgio, inode, false, &nfs_async_read_completion_ops); - ret = read_cache_pages(mapping, pages, readpage_async_filler, &desc); + while ((page = readahead_page(ractl)) != NULL) { + ret = readpage_async_filler(&desc, page); + put_page(page); + if (ret) + break; + } nfs_pageio_complete_read(&desc.pgio); put_nfs_open_context(desc.ctx); out: trace_nfs_aop_readahead_done(inode, nr_pages, ret); - return ret; } int __init nfs_init_readpagecache(void) In v5.16 we had this call to task_io_account_read(PAGE_SIZE); on line 109 of read_cache_pages(); 76 /** 77 * read_cache_pages - populate an address space with some pages & start reads against them 78 * @mapping: the address_space 79 * @pages: The address of a list_head which contains the target pages. These 80 * pages have their ->index populated and are otherwise uninitialised. 81 * @filler: callback routine for filling a single page. 82 * @data: private data for the callback routine. 83 * 84 * Hides the details of the LRU cache etc from the filesystems. 85 * 86 * Returns: %0 on success, error return by @filler otherwise 87 */ 88 int read_cache_pages(struct address_space *mapping, struct list_head *pages, 89 int (*filler)(void *, struct page *), void *data) 90 { 91 struct page *page; 92 int ret = 0; 93 94 while (!list_empty(pages)) { 95 page = lru_to_page(pages); 96 list_del(&page->lru); 97 if (add_to_page_cache_lru(page, mapping, page->index, 98 readahead_gfp_mask(mapping))) { 99 read_cache_pages_invalidate_page(mapping, page); 100 continue; 101 } 102 put_page(page); 103 104 ret = filler(data, page); 105 if (unlikely(ret)) { 106 read_cache_pages_invalidate_pages(mapping, pages); 107 break; 108 } 109 task_io_account_read(PAGE_SIZE); But there's no call to task_io_account_read() anymore in the new readahead code paths that I could tell, but maybe I'm missing something. Willy, Does each caller of readahead_page() now need to call task_io_account_read() or should we add that into readahead_page() or maybe inside read_pages()?