On 02/03, Steve French wrote:
> I saw the following warning about workqueues when running xfstests with multichannel to Windows server target (at test generic/048, which failed due to umount busy). Any thoughts about whether WQ_UNBOUND would help?

Probably. Unless the worker relies on, or benefits from(*), CPU locality, there's no reason not to use WQ_UNBOUND.

(*) I'm not aware of the details of deferred close, but a quick look indicates it neither takes advantage of nor depends on local CPU data.

Manually rotating CPUs with queue_work_on() would also prevent a single CPU from being starved, but doing so randomly, or without any specific goal, would have the same effect as using WQ_UNBOUND in the first place, if not worse (probably).

On a related note, I've been playing with the idea of spreading multichannel workloads across CPUs by "allocating" a CPU to each channel (assuming the client has N CPUs and sets max_channels to N for best performance). The results are promising, but they don't yet justify the amount of modifications needed.

For a quick, unchecked fix, making cifsiod_wq WQ_UNBOUND as well seems beneficial: on a multichannel setup, all cifsiod_wq work runs on a single CPU, which can starve much faster than it would from deferred closes/lease breaks.

Cheers,

Enzo
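
To illustrate the per-channel CPU idea from the reply, here is a minimal userspace sketch in plain C. It is not kernel code and not the actual cifs.ko change: the names channel_to_cpu(), distribute() and max_load() are hypothetical, and "all work lands on CPU 0" is only a stand-in for the single-CPU behaviour described in the thread. In the kernel, the quick fix discussed would instead amount to passing WQ_UNBOUND in the flags given to alloc_workqueue().

```c
/*
 * Hypothetical userspace model (assumption, not kernel code):
 * compare per-CPU load when every channel's work lands on one CPU
 * versus when each channel is "allocated" its own CPU.
 */

/* Map a channel to a CPU, wrapping when channels outnumber CPUs.
 * ncpus must be non-zero. */
static unsigned int channel_to_cpu(unsigned int chan, unsigned int ncpus)
{
    return chan % ncpus;
}

/*
 * Fill load[0..ncpus-1] with the number of work items each CPU receives.
 * spread == 0: every item goes to CPU 0 (models the single-CPU case).
 * spread != 0: each channel's items go to channel_to_cpu(chan, ncpus).
 */
static void distribute(unsigned int nchannels, unsigned int items_per_channel,
                       unsigned int ncpus, int spread, unsigned int *load)
{
    for (unsigned int cpu = 0; cpu < ncpus; cpu++)
        load[cpu] = 0;

    for (unsigned int chan = 0; chan < nchannels; chan++) {
        unsigned int cpu = spread ? channel_to_cpu(chan, ncpus) : 0;
        load[cpu] += items_per_channel;
    }
}

/* Return the heaviest per-CPU load, i.e. the CPU most likely to starve. */
static unsigned int max_load(const unsigned int *load, unsigned int ncpus)
{
    unsigned int max = 0;

    for (unsigned int cpu = 0; cpu < ncpus; cpu++)
        if (load[cpu] > max)
            max = load[cpu];
    return max;
}
```

With 4 channels, 4 CPUs and 100 items per channel, the single-CPU case puts all 400 items on one CPU, while the per-channel mapping caps each CPU at 100. A real implementation would also have to handle CPU hotplug and channels outnumbering CPUs, which this model ignores.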