On Tue, Jan 16, 2024 at 12:45:19PM +0100, Jan Kara wrote:
> On Tue 16-01-24 11:50:32, Christian Brauner wrote:
>
> <snip the usecase details>
>
> > My initial reaction is to give userspace an API to drop the page cache
> > of a specific filesystem which may have additional uses. I initially had
> > started drafting an ioctl() and then got swayed towards a
> > posix_fadvise() flag. I found out that this was already proposed a few
> > years ago but got rejected as it was suspected this might just be
> > someone toying around without a real world use-case. I think this here
> > might qualify as a real-world use-case.
> >
> > This may at least help securing users with a regular dm-crypt setup
> > where dm-crypt is the top layer. Users that stack additional layers on
> > top of dm-crypt may still leak plaintext of course if they introduce
> > additional caching. But that's on them.
>
> Well, your usecase has one substantial difference from drop_caches. You
> actually *require* pages to be evicted from the page cache for security
> purposes. And giving any kind of guarantees is going to be tough. Think for
> example when someone grabs page cache folio reference through vmsplice(2),
> then you initiate your dmSuspend and want to evict page cache. What are you
> going to do? You cannot free the folio while the refcount is elevated, you
> could possibly detach it from the page cache so it isn't at least visible
> but that has side effects too - after you resume the folio would remain
> detached so it will not see changes happening to the file anymore. So IMHO
> the only thing you could do without problematic side-effects is report
> error. Which would be user unfriendly and could be actually surprisingly
> frequent due to transient folio references taken by various code paths.

I wonder though, if you start suspending userspace and the filesystem, how
likely are you to encounter these transient errors?

>
> Sure we could report error only if the page has pincount elevated, not only
> refcount, but it needs some serious thinking how this would interact.
>
> Also what is going to be the interaction with mlock(2)?
>
> Overall this doesn't seem like "just tweak drop_caches a bit" kind of
> work...

So when I talked to the Gnome people they were interested in an optimal
or a best-effort solution. So returning an error might actually be
useful.

I'm specifically putting this here because my knowledge of the page cache
isn't sufficient to judge which guarantees are and aren't feasible. So
I'm grateful for any insight here.