On Thu, Jul 18, 2019 at 07:53:42PM -0700, Dan Williams wrote:
> [ add Sasha for -stable advice ]
>
> On Thu, Jul 18, 2019 at 5:13 PM Liu Bo <bo.liu@xxxxxxxxxxxxxxxxx> wrote:
> >
> > The livelock can be triggered in the following pattern,
> >
> >         while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
> >                                 min(end - index, (pgoff_t)PAGEVEC_SIZE),
> >                                 indices)) {
> >                 ...
> >                 for (i = 0; i < pagevec_count(&pvec); i++) {
> >                         index = indices[i];
> >                         ...
> >                 }
> >                 index++; /* BUG */
> >         }
> >
> > A multi-order exceptional entry is not specially considered in
> > invalidate_inode_pages2_range(), and it ends up in a livelock because
> > both index 0 and index 1 find the same pmd, but this pmd is bound to
> > index 0, so index is set to 0 again.
> >
> > This introduces a helper to take the pmd entry's length into account when
> > deciding the next index.
> >
> > Note that there are other users of the above pattern which don't need
> > to be fixed,
> >
> > - dax_layout_busy_page
> >   It's been fixed in commit d7782145e1ad
> >   ("filesystem-dax: Fix dax_layout_busy_page() livelock")
> >
> > - truncate_inode_pages_range
> >   This won't loop forever since the exceptional entries are immediately
> >   removed from radix tree after the search.
> >
> > Fixes: 642261a ("dax: add struct iomap based DAX PMD support")
> > Cc: <stable@xxxxxxxxxxxxxxx> since 4.9 to 4.19
> > Signed-off-by: Liu Bo <bo.liu@xxxxxxxxxxxxxxxxx>
> > ---
> >
> > The problem is gone after commit f280bf092d48 ("page cache: Convert
> > find_get_entries to XArray"), but since xarray seems too new to backport
> > to 4.19, I made this fix based on radix tree implementation.
>
> I think the right approach in this situation, since mainline does not
> need this change and the bug has been buried under a major refactoring,
> is to send a backport directly against the v4.19 kernel. Include notes
> about how it replaces the fix that was inadvertently contained in
> f280bf092d48 ("page cache: Convert find_get_entries to XArray"). Do you
> have a test case that you can include in the changelog?

The root cause behind the bug is exactly the same as the one fixed by
commit d7782145e1ad ("filesystem-dax: Fix dax_layout_busy_page() livelock").

For a test case, I have one that is not 100% reproducible, based on ltp's
rwtest[1] and virtiofs.

[1]:
$ mount -t virtio_fs -o tag=alwaysdax -o rootmode=040000,user_id=0,group_id=0,dax,default_permissions,allow_other alwaysdax /mnt/virtio-fs/

$ cat test.txt
rwtest01 export LTPROOT; rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:$TMPDIR/rw-sync-$$

$ runltp -d /mnt/virtio-fs -f test.txt

thanks,
-liubo