On Thu, May 13, 2021 at 09:59:46PM -0400, Daniel Jordan wrote:
> On Thu, May 13, 2021 at 02:46:10PM +0200, Peter Zijlstra wrote:
> > Ah, I think I see what you meant to say, it would perhaps help if you
> > write it like so:
> >
> >
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 149e77454e3c..94735248dcd2 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -99,11 +99,10 @@ atomic_t nr_rotate_swap = ATOMIC_INIT(0);
> >
> >  static struct swap_info_struct *swap_type_to_swap_info(int type)
> >  {
> > -	if (type >= READ_ONCE(nr_swapfiles))
> > +	if (type >= MAX_SWAPFILES)
> >  		return NULL;
> >
> > -	smp_rmb();	/* Pairs with smp_wmb in alloc_swap_info. */
> > -	return READ_ONCE(swap_info[type]);
> > +	return READ_ONCE(swap_info[type]); /* rcu_dereference() */
> >  }
> >
> >  static inline unsigned char swap_count(unsigned char ent)
> > @@ -2869,14 +2868,11 @@ static struct swap_info_struct *alloc_swap_info(void)
> >  	}
> >  	if (type >= nr_swapfiles) {
> >  		p->type = type;
> > -		WRITE_ONCE(swap_info[type], p);
> >  		/*
> > -		 * Write swap_info[type] before nr_swapfiles, in case a
> > -		 * racing procfs swap_start() or swap_next() is reading them.
> > -		 * (We never shrink nr_swapfiles, we never free this entry.)
> > +		 * Publish the swap_info_struct.
> >  		 */
> > -		smp_wmb();
> > -		WRITE_ONCE(nr_swapfiles, nr_swapfiles + 1);
> > +		smp_store_release(&swap_info[type], p); /* rcu_assign_pointer() */
> > +		nr_swapfiles++;
>
> Yes, this does help, I didn't understand why smp_wmb stayed around in
> the original post.
>
> I think the only access smp_store_release() orders is p->type.  Wouldn't
> it be kinda inconsistent to only initialize that one field before
> publishing when many others would be done at the end of
> alloc_swap_info() after the fact?  p->type doesn't seem special.  For
> instance, get_swap_page_of_type() touches si->lock soon after it calls
> swap_type_to_swap_info(), so there could be a small window where there's
> a non-NULL si with an uninitialized lock.
>
> It's not as if this is likely to be a problem in practice, it would just
> make it harder to understand why smp_store_release is there.  Maybe all
> we need is a WRITE_ONCE, or if it's really necessary for certain fields
> to be set before publication then move them up and explain?

You also care about the zero fill from kvzalloc(). Without the
smp_store_release() the zero-fill from the memset() might only be
visible 'late'.

Unless that also isn't a problem?
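
To make the ordering concern concrete, here is a rough user-space sketch
of the publish pattern using C11 atomics instead of the kernel
primitives. Everything in it is an assumption for illustration:
struct info, slots[], MAX_SLOTS, publish() and lookup() are hypothetical
stand-ins, calloc() plays the role of kvzalloc(), and the reader uses an
acquire load where the kernel version relies on READ_ONCE() plus the
address dependency. It is not the mm/swapfile.c code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_SLOTS 4	/* hypothetical stand-in for MAX_SWAPFILES */

struct info {		/* hypothetical stand-in for swap_info_struct */
	int type;	/* set explicitly before publication */
	int flags;	/* never written; relies on the zero fill */
};

/* Static storage, so every slot starts out NULL. */
static _Atomic(struct info *) slots[MAX_SLOTS];

/* Writer: allocate zeroed memory, initialise, then publish. */
static struct info *publish(int type)
{
	struct info *p;

	assert(type >= 0 && type < MAX_SLOTS);

	p = calloc(1, sizeof(*p));	/* zero fill, like kvzalloc() */
	if (!p)
		return NULL;

	p->type = type;

	/*
	 * Release store: the zero fill and the p->type store are both
	 * ordered before the pointer becomes visible in slots[].
	 */
	atomic_store_explicit(&slots[type], p, memory_order_release);
	return p;
}

/* Reader: bounds check against the fixed maximum, then load the slot. */
static struct info *lookup(int type)
{
	if (type < 0 || type >= MAX_SLOTS)
		return NULL;

	/* Pairs with the release store in publish(). */
	return atomic_load_explicit(&slots[type], memory_order_acquire);
}
```

With the release store, a reader that gets a non-NULL pointer back from
lookup() also observes flags == 0 and the assigned type. If the store
were memory_order_relaxed instead, the C11 model would give no such
guarantee for either field, which is the 'late' zero fill case above.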