On Fri 29-06-18 17:32:00, Laurent Dufour wrote:
[...]
> As Power is 64K page size based, MAX_GATHER_BATCH = 8189, so
> MAX_GATHER_BATCH_COUNT will not exceed 1.
>
> So there is no way to loop in zap_pte_range() due to the batch's limit.
> I guess we are never hitting the workaround introduced in the commit
> 53a59fc67f97. By the way should cond_resched being called in zap_pte_range()
> when the flush is due to the batch's limit ?

Well, I guess you are missing two things here. The zap path calls
cond_resched once per pmd regardless of the batching. MAX_GATHER_BATCH_COUNT
is there so that we do not accumulate too many pages to free at once after
we are done with the address space teardown (tlb_finish_mmu). So whatever
the batching is, it should not have a big effect on the zap part.

[...]
> Anyway, this should not fix the soft lockup I'm facing because
> MAX_GATHER_BATCH_COUNT=1 on ppc64.
>
> Indeed, I'm wondering if the 10K pages is too large in some cases, especially
> when the node is loaded, and contention on the pte lock is likely to happen.
> Here with less than 8k pages processed soft lockup are surfacing.
>
> Should the MAX_GATHER_BATCH limit be forced to lower value on ppc64 or more
> code introduced to work around that ?

Have you tried to profile what is taking so long? The exit path is not
parallel, so it should not be hitting contention on pte locks, and having
many processes shouldn't add any lock contention that I can see. Why is
the per-pmd cond_resched not enough?
-- 
Michal Hocko
SUSE Labs
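
For reference, the batch arithmetic discussed above can be sketched in plain
userspace C. This is only an illustration, not the kernel source: it assumes
a 64-bit build where the batch header takes 16 bytes and a page pointer takes
8 bytes, and the helper names and constants below are hypothetical. Under
those assumptions a 64K page gives a batch of 8190 entries (one off from the
8189 quoted above, which does not change the outcome) and a batch count of 1,
while a 4K page gives 510 entries and a batch count of 19.

```c
#include <assert.h>

/* Assumption: 64-bit layout, header = next pointer (8) + nr (4) + max (4). */
#define BATCH_HEADER_BYTES 16UL
/* Assumption: one page pointer occupies 8 bytes. */
#define POINTER_BYTES 8UL
/* The "10K pages" cap mentioned in the thread. */
#define GATHER_PAGE_CAP 10000UL

/* Number of page pointers that fit in one page-sized batch. */
static unsigned long max_gather_batch(unsigned long page_size)
{
	assert(page_size > BATCH_HEADER_BYTES);
	return (page_size - BATCH_HEADER_BYTES) / POINTER_BYTES;
}

/* Number of batches allowed before the page cap is reached. */
static unsigned long max_gather_batch_count(unsigned long page_size)
{
	return GATHER_PAGE_CAP / max_gather_batch(page_size);
}
```

Calling max_gather_batch_count() with 65536 and with 4096 shows why the
batch-count limit behaves so differently between 64K and 4K page
configurations, independent of the per-pmd cond_resched in the zap path.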