> >>
> >>> If the device is
> >>> asking for a blocksize > PAGE_SIZE and CONFIG_TRANSPARENT_HUGEPAGE is
> >>> not set, you should also decline to mount the filesystem.
> >>
> >> What does CONFIG_TRANSPARENT_HUGEPAGE have to do with filesystems
> >> being able to use large folios?
> >>
> >> If that's an actual dependency of using large folios, then we're at
> >> the point where the mm side of large folios needs to be divorced
> >> from CONFIG_TRANSPARENT_HUGEPAGE and always supported.
> >> Alternatively, CONFIG_TRANSPARENT_HUGEPAGE needs to selected by the
> >> block layer and also every filesystem that wants to support
> >> sector/blocks sizes larger than PAGE_SIZE. IOWs, large folio
> >> support needs to *always* be enabled on systems that say
> >> CONFIG_BLOCK=y.
> >
> > Why CONFIG_BLOCK? I think it is enough if it comes from the FS side
> > right? And for now, the only FS that needs that sort of bs > ps
> > guarantee is XFS with this series. Other filesystems such as bcachefs
> > that call mapping_set_large_folios() only enable it as an optimization
> > and it is not needed for the filesystem to function.
> >
> > So this is my conclusion from the conversation:
> > - Add a dependency in Kconfig on THP for XFS until we fix the dependency
> >   of large folios on THP
>
> THP isn't supported on some arches, so isn't this effectively saying XFS can no
> longer be used with those arches, even if the bs <= ps? I think while pagecache
> large folios depend on THP, you need to make this a mount-time check in the FS?
>
> But ideally, MAX_PAGECACHE_ORDER would be set to 0 for
> !CONFIG_TRANSPARENT_HUGEPAGE so you can just check against that and don't have
> to worry about THP availability directly.

Yes, that would be better. We should have a way to probe it during mount
time without requiring any address_space mapping.
We could have a helper along these lines:

/* Largest folio order the page cache can back with the current config. */
static inline unsigned int mapping_max_folio_order_supported(void)
{
	/* Page cache large folios currently depend on THP. */
	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
		return 0;
	return MAX_PAGECACHE_ORDER;
}

The FS could use this to verify its block size at mount time.

>
> Willy; Why is MAX_PAGECACHE_ORDER set to 8 when THP is disabled currently?
>

This appeared in this patch with the following comment:
https://lore.kernel.org/linux-fsdevel/20230710130253.3484695-8-willy@xxxxxxxxxxxxx/

+/*
+ * There are some parts of the kernel which assume that PMD entries
+ * are exactly HPAGE_PMD_ORDER. Those should be fixed, but until then,
+ * limit the maximum allocation order to PMD size. I'm not aware of any
+ * assumptions about maximum order if THP are disabled, but 8 seems like
+ * a good order (that's 1MB if you're using 4kB pages)
+ */

> > - Add a BUILD_BUG_ON(XFS_MAX_BLOCKSIZE > MAX_PAGECACHE_ORDER)
> > - Add a WARN_ON_ONCE() and clamp the min and max value in
> >   mapping_set_folio_order_range() ?
> >
> > Let me know what you all think @willy, @dave and @ryan.
> >
> > --
> > Pankaj
>
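
To make the mount-time check a bit more concrete, here is a rough userspace
sketch (not kernel code) of how a filesystem might use such a helper. All of
the sketch_* names and the SKETCH_* constants are made up for illustration:
the page shift of 12 assumes 4kB pages, the maximum order of 8 is only a
placeholder for MAX_PAGECACHE_ORDER, and the thp_enabled argument stands in
for IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE).

```c
#include <assert.h>
#include <errno.h>

/* Assumed values for this sketch only; the kernel derives them per config. */
#define SKETCH_PAGE_SHIFT          12  /* 4kB pages */
#define SKETCH_MAX_PAGECACHE_ORDER 8   /* placeholder for MAX_PAGECACHE_ORDER */

/* thp_enabled stands in for IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE). */
static unsigned int sketch_max_folio_order_supported(int thp_enabled)
{
	if (!thp_enabled)
		return 0;
	return SKETCH_MAX_PAGECACHE_ORDER;
}

/*
 * Hypothetical mount-time check. blocklog is log2 of the FS block size.
 * Returns 0 if the page cache can back that block size, -EINVAL otherwise.
 */
static int sketch_check_blocksize(unsigned int blocklog, int thp_enabled)
{
	unsigned int min_order;

	/* Sanity check so the block size fits in an unsigned int. */
	assert(blocklog < 32);

	/* bs <= ps needs no large folios, so it is always fine. */
	if (blocklog <= SKETCH_PAGE_SHIFT)
		return 0;

	/* bs > ps: every folio must be at least this order. */
	min_order = blocklog - SKETCH_PAGE_SHIFT;
	if (min_order > sketch_max_folio_order_supported(thp_enabled))
		return -EINVAL;
	return 0;
}
```

With this shape, bs <= ps keeps working on arches without THP, and only the
bs > ps case gets rejected at mount time when large folios are unavailable.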