On Mon, Apr 08, 2024 at 05:02:37PM +0100, Matthew Wilcox wrote: > On Mon, Mar 25, 2024 at 10:44:43AM -0700, Linus Torvalds wrote: > > So I actually think most compiler people are perfectly fine with the > > kernel model of mostly doing 'volatile' not on the data structures > > themselves, but as accesses through casts. > > > > It's very traditional C, and there's actually nothing particularly odd > > about it. Not even from a compiler standpoint. > > > > In fact, I personally will argue that it is fundamentally wrong to > > think that the underlying data has to be volatile. A variable may be > > entirely stable in some cases (ie locks held), but not in others. > > > > So it's not the *variable* (aka "object") that is 'volatile', it's the > > *context* that makes a particular access volatile. > > > > That explains why the kernel has basically zero actual volatile > > objects, and 99% of all volatile accesses are done through accessor > > functions that use a cast to mark a particular access volatile. > > What annoys me is that 'volatile' accesses have (at least) two distinct > meanings: > - Make this access untorn > - Prevent various optimisations (code motion, > common-subexpression-elimination, ...) > > As an example, folio_migrate_flags() (in mm/migrate.c): > > if (folio_test_error(folio)) > folio_set_error(newfolio); > if (folio_test_referenced(folio)) > folio_set_referenced(newfolio); > if (folio_test_uptodate(folio)) > folio_mark_uptodate(newfolio); > > ... which becomes... > > 1f: f6 c4 04 test $0x4,%ah > 22: 74 05 je 29 <folio_migrate_flags+0x19> > 24: f0 80 4f 01 04 lock orb $0x4,0x1(%rdi) > 29: 48 8b 03 mov (%rbx),%rax > 2c: a8 04 test $0x4,%al > 2e: 74 05 je 35 <folio_migrate_flags+0x25> > 30: f0 80 4d 00 04 lock orb $0x4,0x0(%rbp) > 35: 48 8b 03 mov (%rbx),%rax > 38: a8 08 test $0x8,%al > 3a: 74 05 je 41 <folio_migrate_flags+0x31> > 3c: f0 80 4d 00 08 lock orb $0x8,0x0(%rbp) > > In my ideal world, the compiler would turn this into: > > newfolio->flags |= folio->flags & MIGRATE_MASK; > > but because folio_test_foo() and folio_set_foo() contain all manner of > volatile casts, the compiler is forced to do individual tests and sets. > > Part of that is us being dumb; folio_set_foo() should be __folio_set_foo() > because this folio is newly allocated and nobody else can be messing > with its flags word yet. I failed to spot that at the time I was doing > the conversion from SetPageFoo to folio_set_foo. > > But if the compiler people could give us something a little more > granular than "scary volatile access disable everything", that would > be nice. Also hard, because now you have to figure out what this new > thing interacts with and when is it safe to do what. OK, I will bite... Why not accumulate the changes in a mask, and then apply the mask the one time? (In situations where __folio_set_foo() need not apply.) If it turns out that we really do need a not-quite-volatile, what exactly does it do? You clearly want it to be able to be optimized so as to merge similar accesses. Is there a limit to the number of accesses that can be merged or to the region of code over which such merging is permitted? Either way, how is the compiler informed of these limits? (I admit that I am not crazy about this sort of proposal, but that might have something to do with the difficulty of repeatedly convincing people that volatile is necessary and must be retained...) Thanx, Paul