On Sun, Aug 25, 2024 at 02:55:41PM -0700, Hugh Dickins wrote:
> The second issue is that swap is more slippery to work with than
> folios or pages: in the folio_nr_pages() case, that number is stable
> because we hold a refcount (which stops a THP from being split), and
> later we'll be taking folio lock too. None of that in the swap case,
> so (depending on how a large entry gets split) the xa_get_order() result
> is not reliable. Likewise for other uses of xa_get_order() in this series.
>
> There needs to be some kind of locking or retry to make the order usable,
> and to avoid shmem_free_swap() occasionally freeing more than it ought.
> I'll give it thought after.

My original thought was that we'd take a bit from the swap entry in
order to indicate the order of the entry. I was surprised to see the
xa_get_order() implementation, but didn't remember why it wouldn't work.
Sorry.

Anyway, that's how I think it should be fixed. Is that enough? Holding
a reference on the folio prevents truncation, splitting, and so on.
There's no reference to be held on a swap entry, so could we have some
moderately implausible series of operations while holding only the RCU
read lock that would cause us to go wrong?
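
To make the "order lives in the entry" idea concrete, here is a rough
userspace sketch. It assumes we steal the low few bits of the entry value
for the order; this is not the kernel's swp_entry_t layout, and
ORDER_BITS, demo_entry_t, pack_entry() and the accessors are all made-up
names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout (an assumption, not the real swp_entry_t):
 * the low ORDER_BITS bits hold the order, the remaining bits hold
 * the swap offset.
 */
#define ORDER_BITS 5u
#define ORDER_MASK ((UINT64_C(1) << ORDER_BITS) - 1)

typedef struct {
	uint64_t val;
} demo_entry_t;

/* Pack an offset and an order into a single value. */
static inline demo_entry_t pack_entry(uint64_t offset, unsigned int order)
{
	demo_entry_t e;

	assert(order <= ORDER_MASK);
	assert(offset <= (UINT64_MAX >> ORDER_BITS));
	e.val = (offset << ORDER_BITS) | order;
	return e;
}

/* The order is recovered from the value itself, with no second lookup. */
static inline unsigned int entry_order(demo_entry_t e)
{
	return (unsigned int)(e.val & ORDER_MASK);
}

static inline uint64_t entry_offset(demo_entry_t e)
{
	return e.val >> ORDER_BITS;
}

/* Number of pages covered by the entry: 2^order. */
static inline uint64_t entry_nr_pages(demo_entry_t e)
{
	return UINT64_C(1) << entry_order(e);
}
```

In this picture, a caller such as shmem_free_swap() would take the order
from the value it actually loaded rather than asking the XArray for it
afterwards. Whether that alone closes the race under only the RCU read
lock is exactly the open question above.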