On Mon, Jun 24, 2024 at 09:41:34AM +0200, Mateusz Guzik wrote: > On kernels compiled without this option (which is currently the default > state) filemap_nr_thps expands to 0. > > do_dentry_open has a big chunk dependent on it, most of which gets > optimized away, except for a branch and a full fence: > > if (f->f_mode & FMODE_WRITE) { > [snip] > smp_mb(); > if (filemap_nr_thps(inode->i_mapping)) { > [snip] > } > } > > While the branch is pretty minor the fence really does not need to be > there. > [the To: recipient bounces, thus got dropped] Now that I sent this I remembered that some of the atomic ops in Linux provide a full fence regardless of an architecture. get_write_access uses atomic_inc_unless_negative which qualifies? 1551 static __always_inline bool 1552 atomic_inc_unless_negative(atomic_t *v) 1553 { 1554 │ kcsan_mb(); 1555 │ instrument_atomic_read_write(v, sizeof(*v)); 1556 │ return raw_atomic_inc_unless_negative(v); 1557 } If so, the ifdefed-out smp_mb can be eliminated altogether if I got my fences right. On top of that mnt_get_write_access injects another smp_mb() anyway. > This is a bare-minimum patch which takes care of it until someone(tm) > cleans this up. Notably it does not conditionally compile other spots > which issue the matching fence. > > I did not bother benchmarking it, not issuing a spurious full fence in > the fast path does not warrant justification from perf standpoint. > > Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx> > --- > > I am not particularly familiar with any of this, the smp_mb in the open > for write path was sticking out like a sore thumb on code read so I > figured there may be One Weird Trick to whack it. > > If the stock code is correct as is, then the ifdef as above is fine. > > The ifdefed chunk is big enough that it should probably be its own > routine. I don't want to bikeshed so I did not go for it. > > For a moment I considered adding filemap_nr_thps_mb which would expand > to 0 or issue the fence + do the read, but then I figured a routine > claiming to post a fence and only conditionally do it is misleading at > best. > > As per the commit message fences in collapse_file remain compiled in. > It is unclear to me if the code following them is doing anything useful > on kernels !CONFIG_READ_ONLY_THP_FOR_FS. > > All that said, if there is cosmetic touch ups you want done here, I can > do them. > > However, a nice full patch would take care of all of the above and I > have neither the information needed to do it nor the interest to get it, > so should someone insinst on a full version I'm going to suggest they > write it themselves. I repeat this is merely a damage control until > someone sorts thigs out. > > fs/open.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/fs/open.c b/fs/open.c > index 28f2fcbebb1b..654c300b3c33 100644 > --- a/fs/open.c > +++ b/fs/open.c > @@ -980,6 +980,7 @@ static int do_dentry_open(struct file *f, > if ((f->f_flags & O_DIRECT) && !(f->f_mode & FMODE_CAN_ODIRECT)) > return -EINVAL; > > +#ifdef CONFIG_READ_ONLY_THP_FOR_FS > /* > * XXX: Huge page cache doesn't support writing yet. Drop all page > * cache for this file before processing writes. > @@ -1007,6 +1008,7 @@ static int do_dentry_open(struct file *f, > filemap_invalidate_unlock(inode->i_mapping); > } > } > +#endif > > return 0; > > -- > 2.43.0 >