On 2020/2/18 23:31, Jason Gunthorpe wrote: > On Tue, Feb 18, 2020 at 11:35:35AM +0800, Lang Cheng wrote: >> The hns3 driver sets "hclge_service_task" workqueue with >> WQ_MEM_RECLAIM flag in order to guarantee forward progress >> under memory pressure. > > Don't do that. WQ_MEM_RECLAIM is only to be used by things interlinked > with reclaimed processing. > > Work on queues marked with WQ_MEM_RECLAIM can't use GFP_KERNEL > allocations, can't do certain kinds of sleeps, can't hold certain > kinds of locks, etc. >From mlx5 driver, it seems that there is GFP_KERNEL allocations on wq marked with WQ_MEM_RECLAIM too: mlx5e_tx_timeout_work() -> mlx5e_safe_reopen_channels() -> mlx5e_safe_switch_channels() -> mlx5e_open_channels() kcalloc() is called with GFP_KERNEL in mlx5e_open_channels(), and mlx5e_tx_timeout_work() is queued with priv->wq, which is allocated with WQ_MEM_RECLAIM flags. see: mlx5e_netdev_init() -> create_singlethread_workqueue() >From the comment in kernel/workqueue.c, the work queued with wq with WQ_MEM_RECLAIM flag set seems to be executed without blocking under some rare case. I still not quite understand the comment, and I can not find any doc that point out the GFP_KERNEL allocations can not be done in wq with WQ_MEM_RECLAIM yet. Is there any doc that mentions that GFP_KERNEL allocations can not be done in wq with WQ_MEM_RECLAIM? /** * rescuer_thread - the rescuer thread function * @__rescuer: self * * Workqueue rescuer thread function. There's one rescuer for each * workqueue which has WQ_MEM_RECLAIM set. * * Regular work processing on a pool may block trying to create a new * worker which uses GFP_KERNEL allocation which has slight chance of * developing into deadlock if some works currently on the same queue * need to be processed to satisfy the GFP_KERNEL allocation. This is * the problem rescuer solves. * * When such condition is possible, the pool summons rescuers of all * workqueues which have works queued on the pool and let them process * those works so that forward progress can be guaranteed. * * This should happen rarely. * * Return: 0 */ The below is the reason we add the sets "hclge_service_task" workqueue with WQ_MEM_RECLAIM through analysing why other ethernet drivers has allocated wq with WQ_MEM_RECLAIM flag, I may be wrong about that: hns3 ethernet driver may be used as the low level transport of a network file system, memory reclaim data path may depend on the worker in hns3 driver to bring back the ethernet link so that it flush the some cache to network based disk. > > Jason > >