On Wed, 2018-11-21 at 02:52 +0000, Wei Yang wrote:
> On Tue, Nov 20, 2018 at 08:58:11AM +0100, osalvador@xxxxxxx wrote:
> > > On the other hand I would like to see the global lock to go away
> > > because it causes scalability issues and I would like to change it
> > > to a range lock. This would make this race possible.
> > >
> > > That being said this is more of a preparatory work than a fix.
> > > One could argue that pgdat resize lock is abused here but I am not
> > > convinced a dedicated lock is much better. We do take this lock
> > > already and spanning its scope seems reasonable. An update to the
> > > documentation is due.
> >
> > Would not make more sense to move it within the pgdat lock
> > in move_pfn_range_to_zone?
> > The call from free_area_init_core is safe as we are single-thread
> > there.
> >
>
> Agree. This would be better.
>
> > And if we want to move towards a range locking, I even think it
> > would be more consistent if we move it within the zone's span lock
> > (which is already wrapped with a pgdat lock).
> >
>
> I lost a little here, just want to confirm with you.
>
> Instead of call pgdat_resize_lock() around init_currently_empty_zone()
> in move_pfn_range_to_zone(), we move init_currently_empty_zone()
> before resize_zone_range()?
>
> This sounds reasonable.

Yeah.
Spanned pages are being touched in:

- shrink_pgdat_span
- resize_zone_range
- init_currently_empty_zone

The first two are already protected by the span lock.
In init_currently_empty_zone, we also touch zone_start_pfn, which marks
the beginning of the spanned pages, so I think it makes sense to
protect it with the span lock as well.

We only call init_currently_empty_zone when the zone is empty, so the
race should be nonexistent, to be honest.
But I think it is more consistent, and since moving it under the span
lock also puts it under the pgdat lock, which was the main point of
this, I do not think we have anything to lose.
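
To make the proposed ordering concrete, here is a rough userspace sketch,
not a kernel patch. Everything prefixed with toy_ is hypothetical: the
locks are plain "held" flags standing in for the pgdat resize lock and
the zone span lock, and the structures carry only the fields discussed
above. The asserts play the role of a "lock must be held" check, so both
writers of the span fields fail loudly if they run outside the span lock.

```c
/* Userspace toy model of the proposed lock ordering. NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock: it only records whether it is held. */
struct toy_lock { bool held; };

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = false; }

struct toy_pgdat {
	struct toy_lock resize_lock;	/* models the pgdat resize lock */
};

struct toy_zone {
	struct toy_lock span_lock;	/* models the zone span lock */
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
	bool initialized;
};

/* Simplified: touches zone_start_pfn, so it wants the span lock held. */
static void toy_init_currently_empty_zone(struct toy_zone *z,
					  unsigned long start_pfn)
{
	assert(z->span_lock.held);
	z->zone_start_pfn = start_pfn;
	z->initialized = true;
}

/* Grow the span so that it covers [start_pfn, start_pfn + nr_pages). */
static void toy_resize_zone_range(struct toy_zone *z, unsigned long start_pfn,
				  unsigned long nr_pages)
{
	unsigned long old_end, new_end;

	assert(z->span_lock.held);
	old_end = z->zone_start_pfn + z->spanned_pages;
	new_end = start_pfn + nr_pages;

	if (z->spanned_pages == 0 || start_pfn < z->zone_start_pfn)
		z->zone_start_pfn = start_pfn;
	if (new_end < old_end)
		new_end = old_end;
	z->spanned_pages = new_end - z->zone_start_pfn;
}

/* Proposed ordering: the init call sits inside the span lock, which is
 * itself nested inside the pgdat resize lock. */
static void toy_move_pfn_range_to_zone(struct toy_pgdat *pgdat,
				       struct toy_zone *z,
				       unsigned long start_pfn,
				       unsigned long nr_pages)
{
	toy_acquire(&pgdat->resize_lock);
	toy_acquire(&z->span_lock);
	if (z->spanned_pages == 0)	/* zone is empty */
		toy_init_currently_empty_zone(z, start_pfn);
	toy_resize_zone_range(z, start_pfn, nr_pages);
	toy_release(&z->span_lock);
	toy_release(&pgdat->resize_lock);
}
```

Calling toy_move_pfn_range_to_zone() on a zero-initialized zone takes the
init path first and then resizes, all under both locks. Calling
toy_init_currently_empty_zone() directly, without the span lock, trips
the assert, which is the consistency being argued for.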