On 09/05/2018 06:34 PM, Matthew Wilcox wrote:
> On Wed, Sep 05, 2018 at 04:53:41PM +0530, Aneesh Kumar K.V wrote:
>> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
>
> How do you go from "can be taken in softirq context" problem report to
> "must disable hard interrupts" solution? Please explain why spin_lock_bh()
> is not a sufficient fix.
>
>> swapper/68/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>> 0000000052a030a7 (hugetlb_lock){+.?.}, at: free_huge_page+0x9c/0x340
>> {SOFTIRQ-ON-W} state was registered at:
>> lock_acquire+0xd4/0x230
>> _raw_spin_lock+0x44/0x70
>> set_max_huge_pages+0x4c/0x360
>> hugetlb_sysctl_handler_common+0x108/0x160
>> proc_sys_call_handler+0x134/0x190
>> __vfs_write+0x3c/0x1f0
>> vfs_write+0xd8/0x220
>
> Also, this only seems to trigger here. Is it possible we _already_
> have softirqs disabled through every other code path, and it's just this
> one sysctl handler that needs to disable softirqs? Rather than every
> lock access?

Are you asking whether I looked at moving that put_page() to a worker
thread? I didn't. The reason I went with the current patch is to allow
put_page() to be called from irq context. We already allow that for
non-hugetlb pages, so I was not sure that adding that additional
restriction for hugetlb is really needed. Further, the conversion to
irqsave/irqrestore was straightforward.

Now, with respect to making sure we don't already have irqs disabled in
those code paths, I did check that. But let me know if you find anything
I missed.
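
To make the lockdep complaint concrete, here is a toy userspace model in C
(not kernel code) of the rule behind the "{SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W}"
report. Every name in it (toy_lock, toy_acquire, replay_report and so on) is
made up for illustration. It only assumes that the _bh and irqsave lock
variants both keep softirqs from running on the local CPU while the lock is
held.

```c
#include <stdbool.h>

/* Toy model only: none of these types or functions exist in the kernel. */
struct ctx {
    bool in_softirq;       /* running inside a softirq handler */
    bool softirqs_enabled; /* a softirq could interrupt this context */
};

struct toy_lock {
    bool taken_softirq_on; /* ever acquired with softirqs enabled (SOFTIRQ-ON-W) */
    bool taken_in_softirq; /* ever acquired from softirq context (IN-SOFTIRQ-W) */
};

enum lock_kind { LOCK_PLAIN, LOCK_BH, LOCK_IRQSAVE };

/* Record how the lock was acquired, the way the report above classifies it. */
static void toy_acquire(struct toy_lock *l, struct ctx c, enum lock_kind kind)
{
    /* Assumption: both non-plain variants keep local softirqs from running. */
    if (kind != LOCK_PLAIN)
        c.softirqs_enabled = false;

    if (c.in_softirq)
        l->taken_in_softirq = true;
    else if (c.softirqs_enabled)
        l->taken_softirq_on = true;
}

/* The same lock class was seen in both states: a softirq could deadlock on it. */
static bool toy_inconsistent(const struct toy_lock *l)
{
    return l->taken_softirq_on && l->taken_in_softirq;
}

/* Replay the two acquisitions from the report, varying only the lock
 * flavor used on the process-context (sysctl write) path. */
static bool replay_report(enum lock_kind process_ctx_kind)
{
    struct toy_lock hugetlb = { false, false };
    struct ctx sysctl_write = { .in_softirq = false, .softirqs_enabled = true };
    struct ctx softirq_free = { .in_softirq = true, .softirqs_enabled = false };

    toy_acquire(&hugetlb, sysctl_write, process_ctx_kind); /* set_max_huge_pages() path */
    toy_acquire(&hugetlb, softirq_free, LOCK_PLAIN);       /* free_huge_page() path */
    return toy_inconsistent(&hugetlb);
}
```

In this model replay_report(LOCK_PLAIN) flags the inconsistency, while
replay_report(LOCK_BH) and replay_report(LOCK_IRQSAVE) do not. The model
does not cover hard-irq context, which is the case where only the irqsave
variant helps.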

> I'm not seeing any analysis in this patch description, just a kneejerk
> "lockdep complained, must disable interrupts".

-aneesh