Re: Issue fixed by commit 53a59fc67f97 is surfacing again..

Michal Hocko <mhocko@xxxxxxxxxx> · Mon, 2 Jul 2018 11:27:13 +0200



On Fri 29-06-18 17:32:00, Laurent Dufour wrote:
[...]
> As Power is 64K page size based, MAX_GATHER_BATCH = 8189, so
> MAX_GATHER_BATCH_COUNT will not exceed 1.
> 
> So there is no way to loop in zap_pte_range() due to the batch's limit.
> I guess we are never hitting the workaround introduced in the commit
> 53a59fc67f97. By the way, should cond_resched be called in zap_pte_range()
> when the flush is due to the batch's limit?
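
For reference, the arithmetic behind the numbers quoted above can be
sketched in plain userspace C. This is only an illustration, not the
kernel header: BATCH_HEADER_BYTES, POINTER_BYTES and the two helper
functions are hypothetical names, the 16-byte header and 8-byte pointer
are assumptions for a 64-bit build, and the 10000-page budget is the
"10K pages" mentioned later in the thread. The per-batch figure depends
on the assumed header size and may differ by a slot from the value
quoted; the point is only that the count rounds down to 1 with 64K
pages.

```c
/* Assumed size of the batch header (a next pointer plus two counters). */
#define BATCH_HEADER_BYTES 16UL
/* Assumed pointer size on a 64-bit build. */
#define POINTER_BYTES 8UL
/* Overall page budget the batch count is derived from ("10K pages"). */
#define BATCH_PAGE_BUDGET 10000UL

/* Page pointers that fit in one batch page of page_size bytes. */
static unsigned long max_gather_batch(unsigned long page_size)
{
	return (page_size - BATCH_HEADER_BYTES) / POINTER_BYTES;
}

/* Number of batch pages allowed within the overall page budget. */
static unsigned long max_gather_batch_count(unsigned long page_size)
{
	return BATCH_PAGE_BUDGET / max_gather_batch(page_size);
}
```

Under these assumptions max_gather_batch_count(65536) evaluates to 1,
while a 4K page size gives a much larger count because each batch page
holds far fewer pointers.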

Well, I guess you are missing two things here. The zap path calls
cond_resched once per pmd regardless of the batching.
MAX_GATHER_BATCH_COUNT is there so that we do not accumulate too many
pages to free at once after we are done with the address space teardown
(tlb_finish_mmu). So whatever the batching is, it should not have a big
effect on the zap part.
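
To make that point concrete, here is a toy userspace model, not kernel
code. struct zap_stats and zap_range_model() are hypothetical names
invented for this sketch, and the model assumes a pmd-aligned range and
simply counts events: one rescheduling point per pmd-sized chunk, and
one batch flush each time batch_max pages have been gathered.

```c
/* Hypothetical counters for this model; not kernel structures. */
struct zap_stats {
	unsigned long resched_points;	/* stand-in for cond_resched() calls */
	unsigned long batch_flushes;	/* times the page batch filled up */
	unsigned long pages;		/* pages visited in total */
};

/*
 * Walk [start, end) one pmd-sized chunk at a time. All sizes are in
 * bytes; start and end are assumed to be pmd-aligned.
 */
static struct zap_stats zap_range_model(unsigned long start,
					unsigned long end,
					unsigned long pmd_size,
					unsigned long page_size,
					unsigned long batch_max)
{
	struct zap_stats st = { 0, 0, 0 };
	unsigned long in_batch = 0;
	unsigned long addr, pte;

	for (addr = start; addr < end; addr += pmd_size) {
		/* Visit every page covered by this pmd-sized chunk. */
		for (pte = addr; pte < addr + pmd_size; pte += page_size) {
			st.pages++;
			if (++in_batch == batch_max) {
				st.batch_flushes++;	/* batch is full */
				in_batch = 0;
			}
		}
		/* One rescheduling point per chunk, whatever batch_max is. */
		st.resched_points++;
	}
	return st;
}
```

Running the model over the same range with a small and a large
batch_max changes batch_flushes but leaves resched_points untouched,
which is the independence described above.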
[...]
> Anyway, this should not fix the soft lockup I'm facing because
> MAX_GATHER_BATCH_COUNT=1 on ppc64.
> 
> Indeed, I'm wondering if the 10K pages is too large in some cases, especially
> when the node is loaded, and contention on the pte lock is likely to happen.
> Here, with fewer than 8K pages processed, soft lockups are surfacing.
> 
> Should the MAX_GATHER_BATCH limit be forced to a lower value on ppc64, or
> more code introduced to work around that?

Have you tried to profile what is taking so long? The exit path is not
parallel, so it should not be contending on pte locks, and having many
processes shouldn't add to any lock contention that I can see. Why is
the per-pmd cond_resched not enough?
-- 
Michal Hocko
SUSE Labs