On 8/28/20 7:59 AM, Andi Kleen wrote: >> Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes") > > I believe the Fixes line is still not correct. The original patch > didn't have that problem. Please identify which patch added > the problematic code. Hi Andi, I agree with Muchun's assessment that the issue was caused by e5ff215941d5. Why? Commit e5ff215941d5 changed initialization of the .data field in the ctl_table structure for hugetlb_sysctl_handler. This was required because the commit introduced multiple hstates. As a result, it can not be known until runtime the value for .data. This code was added to hugetlb_sysctl_handler to set the value at runtime. @@ -1037,8 +1109,19 @@ int hugetlb_sysctl_handler(struct ctl_table *table, int write, struct file *file, void __user *buffer, size_t *length, loff_t *ppos) { + struct hstate *h = &default_hstate; + unsigned long tmp; + + if (!write) + tmp = h->max_huge_pages; + + table->data = &tmp; + table->maxlen = sizeof(unsigned long); The problem is that two threads running in parallel can modify table->data in the global data structure without any synchronization. Worse yet, is that that value is local to their stacks. That was the root cause of the issue addressed by Muchun's patch. Does that analysis make sense? Or, are we missing something. -- Mike Kravetz