Fuad Tabba <tabba@xxxxxxxxxx> writes:

This question should not block merging of this series, since performance
can be improved in a separate series:

> <snip>
>
> +
> +/*
> + * Marks the range [start, end) as mappable by both the host and the guest.
> + * Usually called when guest shares memory with the host.
> + */
> +static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
> +{
> +	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
> +	void *xval = xa_mk_value(KVM_GMEM_ALL_MAPPABLE);
> +	pgoff_t i;
> +	int r = 0;
> +
> +	filemap_invalidate_lock(inode->i_mapping);
> +	for (i = start; i < end; i++) {

Were any alternative data structures considered, or does anyone have
suggestions for alternatives? Doing xa_store() in a loop here will take
a long time for large ranges.

I looked into the following:

Option 1: (preferred) Maple trees

The maple tree has a nice API, though it would be better if it could
combine adjacent ranges that hold the same value. I still have to dig
into performance, but I'm assuming that even large ranges are stored in
a few nodes, so this should be faster than iterating over every index
in an xarray. (A sketch of what gmem_set_mappable() itself could look
like on top of a maple tree is at the end of this mail.)

void explore_maple_tree(void)
{
	DEFINE_MTREE(mt);

	mt_init_flags(&mt, MT_FLAGS_LOCK_EXTERN | MT_FLAGS_USE_RCU);

	mtree_store_range(&mt, 0, 16, xa_mk_value(0x20), GFP_KERNEL);
	mtree_store_range(&mt, 8, 24, xa_mk_value(0x32), GFP_KERNEL);
	mtree_store_range(&mt, 5, 10, xa_mk_value(0x32), GFP_KERNEL);

	{
		void *entry;
		MA_STATE(mas, &mt, 0, 0);

		mas_for_each(&mas, entry, ULONG_MAX) {
			pr_err("[%ld, %ld]: 0x%lx\n", mas.index, mas.last, xa_to_value(entry));
		}
	}

	mtree_destroy(&mt);
}

stdout:

[0, 4]: 0x20
[5, 10]: 0x32
[11, 24]: 0x32

Option 2: Multi-index xarray

The API is more complex than the maple tree's, and IIUC multi-index
xarrays are not generalizable to arbitrary ranges, so the range can't be
8 1G pages + 1 4K page, for example: the size of each entry has to be a
power-of-2 multiple of 4K. Using multi-index xarrays would also mean
computing the order at which to store each multi-index entry. The order
can be computed from the size of the range being added, but that is an
additional source of errors. (A rough sketch is at the end of this
mail.)

Option 3: Interval tree, which is built on top of red-black trees

The API is at a lower level: a macro is used to define interval trees,
the user has to deal with nodes in the tree directly, and the user has
to separately write the code that overrides sub-ranges within larger
ranges. (Also sketched at the end of this mail.)

> +		r = xa_err(xa_store(mappable_offsets, i, xval, GFP_KERNEL));
> +		if (r)
> +			break;
> +	}
> +	filemap_invalidate_unlock(inode->i_mapping);
> +
> +	return r;
> +}
> +
> +/*
> + * Marks the range [start, end) as not mappable by the host. If the host doesn't
> + * have any references to a particular folio, then that folio is marked as
> + * mappable by the guest.
> + *
> + * However, if the host still has references to the folio, then the folio is
> + * marked and not mappable by anyone. Marking it is not mappable allows it to
> + * drain all references from the host, and to ensure that the hypervisor does
> + * not transition the folio to private, since the host still might access it.
> + *
> + * Usually called when guest unshares memory with the host.
> + */
> +static int gmem_clear_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
> +{
> +	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
> +	void *xval_guest = xa_mk_value(KVM_GMEM_GUEST_MAPPABLE);
> +	void *xval_none = xa_mk_value(KVM_GMEM_NONE_MAPPABLE);
> +	pgoff_t i;
> +	int r = 0;
> +
> +	filemap_invalidate_lock(inode->i_mapping);
> +	for (i = start; i < end; i++) {
> +		struct folio *folio;
> +		int refcount = 0;
> +
> +		folio = filemap_lock_folio(inode->i_mapping, i);
> +		if (!IS_ERR(folio)) {
> +			refcount = folio_ref_count(folio);
> +		} else {
> +			r = PTR_ERR(folio);
> +			if (WARN_ON_ONCE(r != -ENOENT))
> +				break;
> +
> +			folio = NULL;
> +		}
> +
> +		/* +1 references are expected because of filemap_lock_folio(). */
> +		if (folio && refcount > folio_nr_pages(folio) + 1) {
> +			/*
> +			 * Outstanding references, the folio cannot be faulted
> +			 * in by anyone until they're dropped.
> +			 */
> +			r = xa_err(xa_store(mappable_offsets, i, xval_none, GFP_KERNEL));
> +		} else {
> +			/*
> +			 * No outstanding references. Transition the folio to
> +			 * guest mappable immediately.
> +			 */
> +			r = xa_err(xa_store(mappable_offsets, i, xval_guest, GFP_KERNEL));
> +		}
> +
> +		if (folio) {
> +			folio_unlock(folio);
> +			folio_put(folio);
> +		}
> +
> +		if (WARN_ON_ONCE(r))
> +			break;
> +	}
> +	filemap_invalidate_unlock(inode->i_mapping);
> +
> +	return r;
> +}
> +
>
> <snip>
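
For concreteness, here is roughly what gmem_set_mappable() could look
like on a maple tree. This is an untested sketch and assumes
mappable_offsets is changed to a struct maple_tree; the point is that
the whole range becomes a single mtree_store_range() call instead of a
per-index loop:

/*
 * Untested sketch: assumes mappable_offsets has become a struct maple_tree.
 * kvm_gmem_private() and KVM_GMEM_ALL_MAPPABLE are as in the patch above.
 */
static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
{
	struct maple_tree *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
	void *xval = xa_mk_value(KVM_GMEM_ALL_MAPPABLE);
	int r;

	filemap_invalidate_lock(inode->i_mapping);
	/* mtree_store_range() takes an inclusive range, hence end - 1. */
	r = mtree_store_range(mappable_offsets, start, end - 1, xval, GFP_KERNEL);
	filemap_invalidate_unlock(inode->i_mapping);

	return r;
}

gmem_clear_mappable() is harder because of the per-folio refcount check,
but stretches of indices with no outstanding references could still be
stored as single ranges.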
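
For option 2, this is roughly the shape the order computation would
take. Again only an untested sketch: the helper name and parameters are
made up, and it needs CONFIG_XARRAY_MULTI:

/*
 * Untested sketch of option 2. Each multi-index entry must cover a
 * naturally aligned power-of-2 number of indices, hence the order
 * computation below.
 */
static int gmem_set_mappable_multi(struct xarray *mappable_offsets,
				   pgoff_t start, pgoff_t end, void *xval)
{
	XA_STATE(xas, mappable_offsets, start);
	pgoff_t index = start;
	int r = 0;

	while (index < end) {
		/* Largest order the current alignment and remaining size allow. */
		unsigned long align = index ? __ffs(index) : BITS_PER_LONG - 1;
		unsigned int order = min(align, __fls(end - index));

		xas_set_order(&xas, index, order);
		do {
			xas_lock(&xas);
			xas_store(&xas, xval);
			xas_unlock(&xas);
		} while (xas_nomem(&xas, GFP_KERNEL));

		r = xas_error(&xas);
		if (r)
			break;

		index += 1UL << order;
	}

	return r;
}

Even in a sketch, getting the alignment/order interaction right is
fiddly, which is what I meant above by an additional source of errors.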
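
And for option 3, this is the kind of manual bookkeeping I mean, using
the generic helpers from <linux/interval_tree.h>. struct gmem_range and
gmem_mark_range() are made up, and splitting partially overlapping
ranges (rather than just dropping them) is left out:

/*
 * Untested sketch of option 3. Note that existing ranges overlapping the
 * new one are simply removed; trimming them is yet more manual work.
 */
struct gmem_range {
	struct interval_tree_node it;	/* it.start / it.last are inclusive */
	unsigned long mappability;	/* e.g. KVM_GMEM_ALL_MAPPABLE */
};

static int gmem_mark_range(struct rb_root_cached *root, unsigned long start,
			   unsigned long last, unsigned long mappability)
{
	struct interval_tree_node *node;
	struct gmem_range *range;

	/* Remove every existing range overlapping [start, last]. */
	while ((node = interval_tree_iter_first(root, start, last))) {
		interval_tree_remove(node, root);
		kfree(container_of(node, struct gmem_range, it));
	}

	range = kzalloc(sizeof(*range), GFP_KERNEL);
	if (!range)
		return -ENOMEM;

	range->it.start = start;
	range->it.last = last;
	range->mappability = mappability;
	interval_tree_insert(&range->it, root);

	return 0;
}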