On Wed, Nov 28, 2018 at 01:01:12AM +0000, Wei Yang wrote: >On Mon, Nov 26, 2018 at 11:17:40PM -0800, Dave Hansen wrote: >>On 11/26/18 10:25 PM, Michal Hocko wrote: >>> [Cc Dave who has added the lock into this path. Maybe he remembers why] >> >>I don't remember specifically. But, the pattern of: >> >> allocate >> lock >> set >> unlock >> >>is _usually_ so we don't have two "sets" racing with each other. In >>this case, that would have been to ensure that two >>sparse_init_one_section()'s didn't race and leak one of the two >>allocated memmaps or worse. >> >>I think mem_hotplug_lock protects this case these days, though. I don't >>think we had it in the early days and were just slumming it with the >>pgdat locks. >> >>I really don't like the idea of removing the lock by just saying it >>doesn't protect anything without doing some homework first, though. It >>would actually be really nice to comment the entire call chain from the >>mem_hotplug_lock acquisition to here. There is precious little >>commenting in there and it could use some love. > >Dave, > >Thanks for your comment :-) > >I should put more words to the reason for removing the lock. > >Here is a simplified call trace for sparse_add_one_section() during >physical add/remove phase. > > __add_memory() > add_memory_resource() > mem_hotplug_begin() > > arch_add_memory() > add_pages() > __add_pages() > __add_section() > sparse_add_one_section(pfn) > > mem_hotplug_done() > >When we just look at the sparse section initialization, we can see the >contention happens when __add_memory() try to add a same range or range >overlapped in SECTIONS_PER_ROOT number of sections. Otherwise, they >won't access the same memory. > >If this happens, we may face two contentions: > > * reallocation of mem_section[root] > * reallocation of memmap and usemap > >While neither of them could be protected by the pgdat_resize_lock from >my understanding. Grab pgdat_resize_lock just slow down the process, >while finally they will replace the mem_section[root] and >ms->section_mem_map with their own new allocated data. > Hmm... sorry, I am not correct here. The pgdat_resize_lock do protect the second case. But not the first one. >Last bu not the least, to be honest, even the global mem_hotplug_lock >doesn't help in this situation. In case __add_memory() try to add the >same range twice, the sparse section would be initialized twice. Which >means it will be overwritten with the new allocated memmap/usermap. > The mem_section[root] still has a chance to face the contention here. >But maybe we have the assumption this reentrance will not happen. > >This is all what I understand, in case there is some misunderstanding, >please let me know. I will rewrite the changelog to emphasize this process is protected by the global mem_hotplug_lock. > >-- >Wei Yang >Help you, Help me -- Wei Yang Help you, Help me