On Mon, Jun 10, 2024 at 03:23:49PM +0200, Michal Hocko wrote:
> On Tue 04-06-24 09:34:48, Byungchul Park wrote:
> > On Mon, Jun 03, 2024 at 06:01:05PM +0100, Matthew Wilcox wrote:
> > > On Mon, Jun 03, 2024 at 09:37:46AM -0700, Dave Hansen wrote:
> > > > Yeah, we'd need some equivalent of a PTE marker, but for the page cache.
> > > > Presumably some xa_value() that means a reader has to go do a
> > > > luf_flush() before going any farther.
> > >
> > > I can allocate one for that. We've got something like 1000 currently
> > > unused values which can't be mistaken for anything else.
> > >
> > > > That would actually have a chance at fixing two issues: One where a new
> > > > page cache insertion is attempted. The other where someone goes to look
> > > > in the page cache and takes some action _because_ it is empty (I think
> > > > NFS is doing some of this for file locks).
> > > >
> > > > LUF is also pretty fundamentally built on the idea that files can't
> > > > change without LUF being aware. That model seems to work decently for
> > > > normal old filesystems on normal old local block devices. I'm worried
> > > > about NFS, and I don't know how seriously folks take FUSE, but it
> > > > obviously can't work well for FUSE.
> > >
> > > I'm more concerned with:
> > >
> > > - page goes back to buddy
> > > - page is allocated to slab
> >
> > At this point, tlb flush needed will be performed in prep_new_page().
>
> But that does mean that an unaware caller would get an additional
> overhead of the flushing, right? I think it would be just a matter of

pcp, which exists for locality, is already a better source for a side
channel attack. FYI, the tlb flush is rarely performed, and only if a
pending tlb flush exists.

> time before somebody can turn that into a side channel attack, not to
> mention unexpected latencies introduced.

Nope. The pending tlb flush performed in prep_new_page() is the one that
would've been done already with the vanilla kernel.
It's not additional tlb flushes but a subset of all the skipped ones.

It's worth noting that all the existing mm reclaim mechanisms have
already introduced worse unexpected latencies.

	Byungchul

> --
> Michal Hocko
> SUSE Labs
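
To make the "subset of the skipped ones" argument concrete, here is a
toy userspace model. It is only a sketch and not the code in the LUF
series: every name in it (model_page, luf_gen, model_unmap_deferred(),
model_flush_all(), model_prep_new_page()) is made up for illustration,
and a real implementation would have to deal with per-CPU state and
locking that this ignores. It assumes that each deferred flush gets a
generation number and that one flush covers everything deferred so far.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. */
struct model_page {
	unsigned long luf_gen;	/* generation of the flush this page waits for, 0 = none */
};

static unsigned long deferred_gen;	/* bumped each time an unmap skips its flush */
static unsigned long flushed_gen;	/* newest generation already covered by a flush */
static unsigned long nr_flushes;	/* flushes actually issued */

/* Unmap path: skip the flush, but record that this page still needs one. */
static void model_unmap_deferred(struct model_page *page)
{
	page->luf_gen = ++deferred_gen;
}

/* One flush covers every flush deferred so far. */
static void model_flush_all(void)
{
	flushed_gen = deferred_gen;
	nr_flushes++;
}

/* Allocation path: flush only if this page's deferred flush is still pending. */
static bool model_prep_new_page(struct model_page *page)
{
	bool pending = page->luf_gen > flushed_gen;

	if (pending)
		model_flush_all();
	page->luf_gen = 0;

	/* Flushes issued here never outnumber the flushes that were skipped. */
	assert(nr_flushes <= deferred_gen);
	return pending;
}
```

In this model a page that was never unmapped with a deferred flush pays
nothing at allocation time, and two deferred unmaps followed by two
allocations cost one flush in total rather than two.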