On 8/24/20 1:59 PM, Andrew Morton wrote: > On Sat, 22 Aug 2020 17:53:28 +0800 Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote: > >> There is a race between the assignment of `table->data` and write value >> to the pointer of `table->data` in the __do_proc_doulongvec_minmax(). > > Where does __do_proc_doulongvec_minmax() write to table->data? > > I think you're saying that there is a race between the assignment of > ctl_table->table in hugetlb_sysctl_handler_common() and the assignment > of the same ctl_table->table in hugetlb_overcommit_handler()? > > Or not, maybe I'm being thick. Can you please describe the race more > carefully and completely? > I too am looking at this now and do not completely understand the race. It could be that: hugetlb_sysctl_handler_common ... table->data = &tmp; and, do_proc_doulongvec_minmax() ... return __do_proc_doulongvec_minmax(table->data, table, write, ... with __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, ... ... i = (unsigned long *) data; ... *i = val; So, __do_proc_doulongvec_minmax can be dereferencing and writing to the pointer in one thread when hugetlb_sysctl_handler_common is setting it in another? Another confusing part of the message is the stack trace which includes ... ? set_max_huge_pages+0x3da/0x4f0 ? alloc_pool_huge_page+0x150/0x150 which are 'downstream' from these routines. I don't understand why these are in the trace. If the race is with the pointer set and dereference/write, then this type of fix is OK. However, if you really have two 'sysadmin type' global operations racing then one or both are not going to get what they expected. Instead of changing the code to 'handle the race', I think it might be acceptable to just put a big semaphore around it. -- Mike Kravetz