On Wed, Jan 08, 2025 at 02:50:36PM -0800, Dan Williams wrote: > Alistair Popple wrote: > > Several functions internal to FS DAX use the following pattern when > > trying to obtain an unlocked entry: > > > > xas_for_each(&xas, entry, end_idx) { > > if (dax_is_locked(entry)) > > entry = get_unlocked_entry(&xas, 0); > > > > This is problematic because get_unlocked_entry() will get the next > > present entry in the range, and the next entry may not be > > locked. Therefore any processing of the original locked entry will be > > skipped. This can cause dax_layout_busy_page_range() to miss DMA-busy > > pages in the range, leading file systems to free blocks whilst DMA > > operations are ongoing which can lead to file system corruption. > > > > Instead callers from within a xas_for_each() loop should be waiting > > for the current entry to be unlocked without advancing the XArray > > state so a new function is introduced to wait. > > Oh wow, good eye! > > Did this trip up an xfstest, or did you see this purely by inspection? Oh this was a "fun" one to track down :-) The other half of the story is in "fs/dax: Always remove DAX page-cache entries when breaking layouts". With just that patch applied xfstest triggered the new WARN_ON_ONCE in truncate_folio_batch_exceptionals(). That made no sense, because that patch makes breaking layouts also remove the DAX page-cache entries. Therefore no DAX page-cache entries should be found in truncate_folio_batch_exceptionals() which is now more of a sanity check. However due to the bug addressed by this patch DAX page-cache entries which should have been deleted as part of breaking layouts were being observed in truncate_folio_batch_exceptionals(). Prior to this series nothing would have noticed these being skipped because dax_delete_mapping_entry() doesn't check if the page is DMA idle. I believe this could lead to filesystem corruption if the locked entry was DMA-busy because the filesystem would assume the page was DMA-idle and therefore the underlying block free to be reallocated. However writing a test to actually prove this is tricky, and I didn't get time to do so. > > Also while we are here rename get_unlocked_entry() to > > get_next_unlocked_entry() to make it clear that it may advance the > > iterator state. > > Outside of the above clarification of how found / end user effect you > can add: > > Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>