Re: [PATCH RFC v2 1/6] fs: add i_state helpers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 23 Aug 2024, Linus Torvalds wrote:
> On Fri, 23 Aug 2024 at 08:14, NeilBrown <neilb@xxxxxxx> wrote:
> >
> > Maybe inode_wake_up_bit() should not have a barrier and the various
> > callers should add whichever barrier is appropriate.  That is the model
> > that Linus prefers for wake_up_bit() and for consistency it should apply
> > to inode_wake_up_bit() too.
> 
> I think that for the wake_up_bit() cases in particular, it's fairly
> easy to add whatever "clear_and_wakeup()" helpers. After all, there's
> only so much you can do to one bit, and I think merging the "act" with
> the "wakeup" into one routine is not only going to help fix the memory
> ordering problem, I think it's going to result in more readable code
> even aside from any memory ordering issues.

That patterns that don't easily fit a helper include:

  - skipping the wake_up when there is no hurry.
    nfs_page_clear_headlock() does this is PG_CONTENDED1
    isn't set.  intel_recv_event() does something very similar
    if STATE_FIRMWARE_LOADED isn't set.
    I imagine changing these to
         if (need a wakeup)
             clear_bit_and_wakeup(...)
         else
             clear_bit()
    rpc_make_runnable() has three case and maybe each would need
    a separate clear_bit in order to have a clear_bit adjacent to
    the wake_up().  That might actually be a good thing, I haven't
    really tried yet.

  - key_reject_and_link() separate the test_and_clear from the
    wake_up_bit() for reasons that aren't completely obvious to me.
    __key_instantiate_and_link() does the same.
    Maybe they just move the wake_up outside the mutex....

> 
> The wake_up_var() infrastructure that the inode code uses is a bit
> more involved. Not only can the variable be anything at all (so the
> operations you can do on it are obviously largely unbounded), but the
> inode hack in particular then uses one thing for the actual variable,
> and another thing for the address that is used to match up waits and
> wakeups.
> 
> So I suspect the inode code will have to do its thing explcitly with
> the low-level helpers and deal with the memory ordering issues on its
> own, adding the smp_mb() manually.

The problem here is that "wake_up_var()" doesn't *look* like a
low-level helper.  Maybe we could replace the few remaining instances
with __wake_up_var() once the easy cases are fixed??

Thanks,
NeilBrown




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux