Daniel Jordan <daniel.m.jordan@xxxxxxxxxx> writes: > On Thu, May 13, 2021 at 02:46:10PM +0200, Peter Zijlstra wrote: >> Ah, I think I see what you meant to say, it would perhaps help if you >> write it like so: >> >> >> diff --git a/mm/swapfile.c b/mm/swapfile.c >> index 149e77454e3c..94735248dcd2 100644 >> --- a/mm/swapfile.c >> +++ b/mm/swapfile.c >> @@ -99,11 +99,10 @@ atomic_t nr_rotate_swap = ATOMIC_INIT(0); >> >> static struct swap_info_struct *swap_type_to_swap_info(int type) >> { >> - if (type >= READ_ONCE(nr_swapfiles)) >> + if (type >= MAX_SWAPFILES) >> return NULL; >> >> - smp_rmb(); /* Pairs with smp_wmb in alloc_swap_info. */ >> - return READ_ONCE(swap_info[type]); >> + return READ_ONCE(swap_info[type]); /* rcu_dereference() */ >> } >> >> static inline unsigned char swap_count(unsigned char ent) >> @@ -2869,14 +2868,11 @@ static struct swap_info_struct *alloc_swap_info(void) >> } >> if (type >= nr_swapfiles) { >> p->type = type; >> - WRITE_ONCE(swap_info[type], p); >> /* >> - * Write swap_info[type] before nr_swapfiles, in case a >> - * racing procfs swap_start() or swap_next() is reading them. >> - * (We never shrink nr_swapfiles, we never free this entry.) >> + * Publish the swap_info_struct. >> */ >> - smp_wmb(); >> - WRITE_ONCE(nr_swapfiles, nr_swapfiles + 1); >> + smp_store_release(&swap_info[type], p); /* rcu_assign_pointer() */ >> + nr_swapfiles++; > > Yes, this does help, I didn't understand why smp_wmb stayed around in > the original post. > > I think the only access smp_store_release() orders is p->type. Wouldn't > it be kinda inconsistent to only initialize that one field before > publishing when many others would be done at the end of > alloc_swap_info() after the fact? In addition to p->type, *p is zeroed via kvzalloc(). > p->type doesn't seem special. For > instance, get_swap_page_of_type() touches si->lock soon after it calls > swap_type_to_swap_info(), so there could be a small window where there's > a non-NULL si with an uninitialized lock. We usually check the state of swap_info_struct before other operations. For example, we check si->swap_map in swap_start(). > It's not as if this is likely to be a problem in practice, it would just > make it harder to understand why smp_store_release is there. Maybe all > we need is a WRITE_ONCE, or if it's really necessary for certain fields > to be set before publication then move them up and explain? I think we have initialized all fields before publication :-). Best Regards, Huang, Ying