On Wed, Jan 17, 2024 at 02:52:32PM +0000, Matthew Wilcox wrote:
> On Wed, Jan 17, 2024 at 03:35:28PM +0100, Jan Kara wrote:
> > OK. So could we then define the effect of your desired call as calling
> > posix_fadvise(..., POSIX_FADV_DONTNEED) for every file? This is kind of
> > best-effort eviction which is reasonably well understood by everybody.
>
> I feel like we're in an XY trap [1]. What Christian actually wants is
> to not be able to access the contents of a file while the device it's
> on is suspended, and we've gone from there to "must drop the page cache".
>
> We have numerous ways to intercept file reads and make them either
> block or fail. The obvious one to me is security_file_permission()
> called from rw_verify_area(). Can we do everything we need with an LSM?
>
> [1] https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem

Nice idea and we do stuff like that in other scenarios such as [1] where
we care about preventing _writes_ from occurring while a specific service
hasn't been fully set up. So that has been going through my mind as
well. And the LSM approach might be complementary. For example, if
feasible, it could be activated _before_ the freeze operation, allowing
only the block-layer-initiated freeze. And then we can drop the page
cache.

But in this case the LSM approach isn't easily workable, nor does it
solve the problem for Gnome. It would most likely force the use of a bpf
LSM as well. And the LSM would have to be activated when the filesystem
is frozen and then deactivated when it is unfrozen. I'm not even sure
that's currently easily doable.

But the Gnome use-case wants to be able to drop file contents before
they suspend the system. So the threat model is wider than just someone
being able to read contents on an active system. But it's best-effort of
course. So failing and reporting an error would be totally fine and then
policy could dictate whether to suspend at all. It actually might help
userspace in general.
The ability to drop the page cache of a specific filesystem is useful
independent of the Gnome use-case, especially in systems with thousands
or tens of thousands of services that use separate filesystem images,
which is not uncommon.

[1]: https://github.com/systemd/systemd/blob/74e6a7d84a40de18bb3b18eeef6284f870f30a6e/src/nsresourced/bpf/userns_restrict/userns-restrict.bpf.c
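
For illustration only: the best-effort, per-file eviction Jan describes
above (posix_fadvise(..., POSIX_FADV_DONTNEED) for every file) could be
approximated from userspace roughly as below. This is a sketch under
assumptions, not something proposed in this thread. The names
drop_file_cache() and drop_tree_cache() are hypothetical, it goes through
Python's os.posix_fadvise() wrapper, it assumes Linux, and it calls
fsync() first because POSIX_FADV_DONTNEED does not discard dirty pages.

```python
import os


def drop_file_cache(path):
    """Ask the kernel to drop cached pages of one file (advisory only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages are not evicted by DONTNEED, so write them back first.
        os.fsync(fd)
        # offset=0 and length=0 cover the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


def drop_tree_cache(root):
    """Advise DONTNEED for every regular file below root.

    Returns a list of (path, OSError) pairs for files that could not be
    handled, so the caller can apply its own policy.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files (FIFOs, sockets, devices).
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                drop_file_cache(path)
            except OSError as exc:
                failures.append((path, exc))
    return failures
```

The failure list maps onto the "fail and report an error, then let policy
decide" model mentioned above. The call is only a hint, so pages that are
mapped or otherwise in use may stay cached, and walking every file is
exactly the cost a per-filesystem drop operation would avoid.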