On Mon, Jun 24, 2019 at 3:41 AM Jan Kara <jack@xxxxxxx> wrote: > > On Mon 24-06-19 09:25:00, Miklos Szeredi wrote: > > [cc: vivek, stefan, dgilbert] > > > > On Fri, Jun 21, 2019 at 12:04 AM Liu Bo <obuil.liubo@xxxxxxxxx> wrote: > > > > > > On Thu, Jun 20, 2019 at 1:36 AM Jan Kara <jack@xxxxxxx> wrote: > > > > > > > > [added some relevant lists to CC - this can safe some people debugging by > > > > being able to google this discussion] > > > > > > > > On Wed 19-06-19 15:57:38, Liu Bo wrote: > > > > > I found a weird dead loop within invalidate_inode_pages2_range, the > > > > > reason being that pagevec_lookup_entries(index=1) returns an indices > > > > > array which has only one entry storing value 0, and this has led > > > > > invalidate_inode_pages2_range() to a dead loop, something like, > > > > > > > > > > invalidate_inode_pages2_range() > > > > > -> while (pagevec_lookup_entries(index=1, indices)) > > > > > -> for (i = 0; i < pagevec_count(&pvec); i++) { > > > > > -> index = indices[0]; // index is set to 0 > > > > > -> if (radix_tree_exceptional_entry(page)) { > > > > > -> if (!invalidate_exceptional_entry2()) // > > > > > ->__dax_invalidate_mapping_entry // return 0 > > > > > -> // entry marked as PAGECACHE_TAG_DIRTY/TOWRITE > > > > > ret = -EBUSY; > > > > > ->continue; > > > > > } // end of if (radix_tree_exceptional_entry(page)) > > > > > -> index++; // index is set to 1 > > > > > > > > > > The following debug[1] proved the above analysis, I was wondering if > > > > > this was a corner case that pagevec_lookup_entries() allows or a > > > > > known bug that has been fixed upstream? > > > > > > > > > > ps: the kernel in use is 4.19.30 (LTS). > > > > > > > > Hum, the above trace suggests you are using DAX. Are you really? Because the > > > > stacktrace below shows we are working on fuse inode so that shouldn't > > > > really be DAX inode... > > > > > > > > > > So I was running tests against virtiofs[1] which adds dax support to > > > fuse, with dax, fuse provides posix stuff while dax provides data > > > channel. > > > > > > [1]: https://virtio-fs.gitlab.io/ > > > https://gitlab.com/virtio-fs/linux > > OK, thanks for the explanation and the pointer. So if I should guess, I'd > say that there's some problem with multiorder entries (for PMD pages) in > the radix tree. In particular if you lookup index 1 and there's > multiorder entry for indices 0-511, radix_tree_next_chunk() is updating > iter->index like: > > iter->index = (index &~ node_maxindex(node)) | (offset << node->shift); > > and offset is computed by radix_tree_descend() as: > > offset = (index >> parent->shift) & RADIX_TREE_MAP_MASK; > > So this all results in iter->index being set to 0 and thus confusing the > iteration in invalidate_inode_pages2_range(). Current kernel has xarray > code from Matthew which maintains originally passed index in xas.xa_index > and thus the problem isn't there. > > So to sum up: Seems like a DAX-specific bug with PMD entries in older > kernels fixed by xarray rewrite. > Thank you so much for the information, Jan. I'll double check if that's the root cause and report back, if yes, guess then we have to fix 4.19's radix tree in place to do the right thing instead of porting back xarray rewrite.. thanks, liubo > Honza > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR