On Tue, Feb 13, 2024 at 02:18:10PM +0530, Charan Teja Kalla wrote:
> An anon THP page is first added to swap cache before reclaiming it.
> Initially, each tail page contains the proper swap entry value (stored in
> the ->private field), which is filled in by add_to_swap_cache(). After
> migrating the THP page sitting on the swap cache, only the swap entry of
> the head page is filled (see folio_migrate_mapping()).
>
> Now when this page is split (one case is when this page is again
> migrated, see migrate_pages()->try_split_thp()), the tail pages'
> ->private does not hold the proper swap entry values. When such a tail
> page is later freed, delete_from_swap_cache() is called as part of that,
> which operates on the wrong swap cache index and eventually
> replaces the wrong swap cache index with a shadow/NULL value and frees
> the page.
>
> This leads to a state where the swap cache contains the freed page.
> This issue can manifest in many forms and the most common thing observed
> is the rcu stall during the swapin (see mapping_get_entry()).
>
> On the recent kernels, this issue is indirectly getting fixed with the
> series[1], to be specific[2].

Then why can we not take that series?  Taking one-off patches almost
ALWAYS causes future problems. What are you going to do to prevent that
here (merge and logic problems)?

> When trying to backport this series, many merge conflicts were
> observed, and it also seems dependent on many other changes. As
> backporting to LTS branches is not a trivial one, a change similar to
> [2] is picked as a fix.
>
> [1] https://lore.kernel.org/all/20230821160849.531668-1-david@xxxxxxxxxx/
> [2] https://lore.kernel.org/all/20230821160849.531668-5-david@xxxxxxxxxx/

Again, please try to take the original series, ESPECIALLY for stuff in
-mm which is tricky and likely to blow up in odd ways in the future.

So I will not take this unless the -mm maintainers agree it really is
the only way forward.

thanks,

greg k-h
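
For readers following the quoted description, here is a toy model of the failure mode, written in Python purely for illustration. It is not kernel code: the dict standing in for the swap cache, the Page class, the NR_SUBPAGES value, and the fill_tails flag are all assumptions made up for this sketch, and only the function names echo the ones mentioned above. It shows how freeing tail pages whose ->private was never refilled after migration leaves slots in the cache that still point at freed pages.

```python
from dataclasses import dataclass

NR_SUBPAGES = 4  # toy value (assumption); a real THP has many more subpages


@dataclass
class Page:
    name: str
    private: int = 0      # stands in for page->private (swap entry value)
    freed: bool = False


def add_to_swap_cache(cache, pages, base):
    """Store every subpage in the cache and record its swap entry."""
    for i, page in enumerate(pages):
        page.private = base + i
        cache[base + i] = page


def migrate(cache, old_pages, fill_tails=False):
    """Move the THP to new pages.

    By default only the head page gets its swap entry, as the quoted
    description says happens after migration. fill_tails is a
    hypothetical switch used here only to show the contrasting case.
    """
    base = old_pages[0].private
    new_pages = [Page(f"new{i}") for i in range(len(old_pages))]
    new_pages[0].private = base          # tails keep private == 0
    for i, page in enumerate(new_pages):
        if fill_tails:
            page.private = base + i
        cache[base + i] = page
    return new_pages


def delete_from_swap_cache(cache, page):
    """Drop the slot named by page.private, then free the page."""
    cache.pop(page.private, None)        # wrong slot if private is stale
    page.freed = True


def stale_slots(cache):
    """Return the cache indices that still point at a freed page."""
    return sorted(i for i, p in cache.items() if p.freed)


def run_scenario(base=100, fill_tails=False):
    cache = {}
    pages = [Page(f"old{i}") for i in range(NR_SUBPAGES)]
    add_to_swap_cache(cache, pages, base)
    pages = migrate(cache, pages, fill_tails=fill_tails)
    # After a split, each tail page is freed on its own.
    for tail in pages[1:]:
        delete_from_swap_cache(cache, tail)
    return stale_slots(cache)
```

In this model, run_scenario() reports the tail indices as stale because each tail is looked up under index 0 instead of its real slot, while run_scenario(fill_tails=True) reports none. How the real series [1]/[2] avoids the problem is not modelled here.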