On 03/29/2018 08:42 PM, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>
>
> Simplify the locking by taking the spinlock while we walk the tree on
> the assumption that many acquires and releases of the lock will be
> worse than holding the lock for a (potentially) long time.

I see this change made in several of the patches and do not have a
specific issue with it.

As part of the XArray implementation you have XA_CHECK_SCHED = 4096.
So, we drop locks and do a cond_resched after XA_CHECK_SCHED iterations.
Just curious how you came up with that number.  Apologies in advance if
this was discussed in a previous round of reviews.

>
> We could replicate the same locking behaviour with the xarray, but would
> have to be careful that the xa_node wasn't RCU-freed under us before we
> took the lock.
>
> Signed-off-by: Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>

I did not do a detailed review of the XArray implementation.  Only
looked at the provided interfaces and their intended uses.  If the
interfaces work as specified, the changes here are fine.

Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

--
Mike Kravetz

> ---
>  mm/memfd.c | 43 ++++++++++++++++++-------------------------
>  1 file changed, 18 insertions(+), 25 deletions(-)
>
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 4cf7401cb09c..3b299d72df78 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -21,7 +21,7 @@
>  #include <uapi/linux/memfd.h>
>
>  /*
> - * We need a tag: a new tag would expand every radix_tree_node by 8 bytes,
> + * We need a tag: a new tag would expand every xa_node by 8 bytes,
>   * so reuse a tag which we firmly believe is never set or cleared on shmem.
>   */
>  #define SHMEM_TAG_PINNED        PAGECACHE_TAG_TOWRITE
> @@ -29,35 +29,28 @@
>
>  static void shmem_tag_pins(struct address_space *mapping)
>  {
> -        struct radix_tree_iter iter;
> -        void __rcu **slot;
> -        pgoff_t start;
> +        XA_STATE(xas, &mapping->i_pages, 0);
>          struct page *page;
> +        unsigned int tagged = 0;
>
>          lru_add_drain();
> -        start = 0;
> -        rcu_read_lock();
> -
> -        radix_tree_for_each_slot(slot, &mapping->i_pages, &iter, start) {
> -                page = radix_tree_deref_slot(slot);
> -                if (!page || radix_tree_exception(page)) {
> -                        if (radix_tree_deref_retry(page)) {
> -                                slot = radix_tree_iter_retry(&iter);
> -                                continue;
> -                        }
> -                } else if (page_count(page) - page_mapcount(page) > 1) {
> -                        xa_lock_irq(&mapping->i_pages);
> -                        radix_tree_tag_set(&mapping->i_pages, iter.index,
> -                                           SHMEM_TAG_PINNED);
> -                        xa_unlock_irq(&mapping->i_pages);
> -                }
>
> -                if (need_resched()) {
> -                        slot = radix_tree_iter_resume(slot, &iter);
> -                        cond_resched_rcu();
> -                }
> +        xas_lock_irq(&xas);
> +        xas_for_each(&xas, page, ULONG_MAX) {
> +                if (xa_is_value(page))
> +                        continue;
> +                if (page_count(page) - page_mapcount(page) > 1)
> +                        xas_set_tag(&xas, SHMEM_TAG_PINNED);
> +
> +                if (++tagged % XA_CHECK_SCHED)
> +                        continue;
> +
> +                xas_pause(&xas);
> +                xas_unlock_irq(&xas);
> +                cond_resched();
> +                xas_lock_irq(&xas);
>          }
> -        rcu_read_unlock();
> +        xas_unlock_irq(&xas);
>  }
>
>  /*
>
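For anyone following along, my reading of the new iteration pattern in the
hunk above is roughly the sketch below -- just an illustration of the
"hold the lock, pause and reschedule every N entries" idiom, not the actual
memfd code; BATCH stands in for XA_CHECK_SCHED and the per-entry tagging
work is elided:

#include <linux/xarray.h>
#include <linux/sched.h>

#define BATCH        4096        /* plays the role of XA_CHECK_SCHED */

/* Visit every entry with the xa_lock held, yielding every BATCH entries. */
static void walk_all(struct xarray *xa)
{
        XA_STATE(xas, xa, 0);
        void *entry;
        unsigned int seen = 0;

        xas_lock_irq(&xas);
        xas_for_each(&xas, entry, ULONG_MAX) {
                if (xa_is_value(entry))
                        continue;        /* skip value (shadow) entries */

                /* ... per-entry work would go here ... */

                if (++seen % BATCH)
                        continue;

                /* Remember our place, drop the lock, let the scheduler
                 * run, then resume the walk from where we paused. */
                xas_pause(&xas);
                xas_unlock_irq(&xas);
                cond_resched();
                xas_lock_irq(&xas);
        }
        xas_unlock_irq(&xas);
}

As I understand it, the xas_pause() is what makes dropping the lock
mid-walk safe: it resets the xa_state so the iteration restarts from the
saved index rather than from a node that may have gone away while the
lock was not held.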