On Wed, Feb 19, 2020 at 09:13:23AM +0800, Yunsheng Lin wrote: > On 2020/2/18 23:31, Jason Gunthorpe wrote: > > On Tue, Feb 18, 2020 at 11:35:35AM +0800, Lang Cheng wrote: > >> The hns3 driver sets "hclge_service_task" workqueue with > >> WQ_MEM_RECLAIM flag in order to guarantee forward progress > >> under memory pressure. > > > > Don't do that. WQ_MEM_RECLAIM is only to be used by things interlinked > > with reclaimed processing. > > > > Work on queues marked with WQ_MEM_RECLAIM can't use GFP_KERNEL > > allocations, can't do certain kinds of sleeps, can't hold certain > > kinds of locks, etc. > > From mlx5 driver, it seems that there is GFP_KERNEL allocations > on wq marked with WQ_MEM_RECLAIM too: > > mlx5e_tx_timeout_work() -> mlx5e_safe_reopen_channels() -> > mlx5e_safe_switch_channels() -> mlx5e_open_channels() > > kcalloc() is called with GFP_KERNEL in mlx5e_open_channels(), > and mlx5e_tx_timeout_work() is queued with priv->wq, which is > allocated with WQ_MEM_RECLAIM flags. see: > > mlx5e_netdev_init() -> create_singlethread_workqueue() There are two reasons for that, first mlx5 driver was written far before WQ_MEM_RECLAIM usage was clarified, second mlx5 has bugs. > > > From the comment in kernel/workqueue.c, the work queued with > wq with WQ_MEM_RECLAIM flag set seems to be executed without > blocking under some rare case. I still not quite understand > the comment, and I can not find any doc that point out the > GFP_KERNEL allocations can not be done in wq with WQ_MEM_RECLAIM > yet. Is there any doc that mentions that GFP_KERNEL allocations > can not be done in wq with WQ_MEM_RECLAIM? It is whole purpose of WQ_MEM_RECLAIM flag - allow progress in case of memory pressure. Allocation memory while we are under memory pressure is an invitation for a disaster. > > > /** > * rescuer_thread - the rescuer thread function > * @__rescuer: self > * > * Workqueue rescuer thread function. There's one rescuer for each > * workqueue which has WQ_MEM_RECLAIM set. > * > * Regular work processing on a pool may block trying to create a new > * worker which uses GFP_KERNEL allocation which has slight chance of > * developing into deadlock if some works currently on the same queue > * need to be processed to satisfy the GFP_KERNEL allocation. This is > * the problem rescuer solves. > * > * When such condition is possible, the pool summons rescuers of all > * workqueues which have works queued on the pool and let them process > * those works so that forward progress can be guaranteed. > * > * This should happen rarely. > * > * Return: 0 > */ > > > The below is the reason we add the sets "hclge_service_task" workqueue > with WQ_MEM_RECLAIM through analysing why other ethernet drivers has > allocated wq with WQ_MEM_RECLAIM flag, I may be wrong about that: Many drivers are developed using copy/paste technique, so it is wrong to assume that "other ethernet drivers" did the right thing. > > hns3 ethernet driver may be used as the low level transport of a > network file system, memory reclaim data path may depend on the > worker in hns3 driver to bring back the ethernet link so that it flush > the some cache to network based disk. Unlikely that this "network file system" dependency on ethernet link is correct. Thanks > > > > > Jason > > > > >