On Tue, Jul 14, 2020 at 11:52:52AM +0100, John Garry wrote:
> >
> > In my machine, there are 32 queues(32 cpu cores), each queue has 1013
> > tags, so there can be 32*1013 requests coming from block layer, meantime
> > smartpqi can only handles 1013 requests. I guess it isn't hard to
> > trigger softlock by running heavy/concurrent smartpqi IO.
>
> Since pqi_alloc_io_request() does not use spinlock, disable preemption,

The RCU read lock is held when calling .queue_rq(), and preempt_disable()
is implied when CONFIG_PREEMPT_RCU is off. A CPU looping in an RCU
read-side critical section may cause related issues, because RCU's CPU
Stall Detector will warn about it.

> etc., so I guess that there is more of a chance of simply IO timeout.
>
> But I see in pqi_get_physical_disk_info() that there is some intelligence to
> set the queue depth, which may reduce chance of timeout (by reducing disk
> queue depth). Not sure.

It may not work, see:

[root@hp-dl380g10-01 mingl]# cat /sys/block/sd[a-f]/device/queue_depth
1013
1013
1013
1013
1013
1013

All sd[a-f] are smartpqi LUNs.

Thanks,
Ming
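
For reference, the arithmetic behind the numbers quoted above can be sketched as follows. This is only a back-of-the-envelope illustration, not driver code: the constant and function names are made up for this example, and the figures (32 queues, 1013 tags per queue, 1013 requests the controller can handle) are taken from this thread.

```python
# Figures quoted in the thread; the names are hypothetical, chosen for this sketch.
NR_HW_QUEUES = 32        # one hardware queue per CPU core
TAGS_PER_QUEUE = 1013    # tags available in each hardware queue
HOST_IO_SLOTS = 1013     # requests smartpqi can handle at the same time


def oversubscription(nr_queues, tags_per_queue, host_slots):
    """Return (max requests the block layer can issue, requests left without a slot)."""
    max_issued = nr_queues * tags_per_queue
    without_slot = max(0, max_issued - host_slots)
    return max_issued, without_slot


max_issued, without_slot = oversubscription(
    NR_HW_QUEUES, TAGS_PER_QUEUE, HOST_IO_SLOTS
)
print(max_issued, without_slot)  # 32416 31403
```

In the worst case, then, 31403 of the 32416 requests would have to wait for a free slot. A per-LUN queue_depth of 1013 does not narrow this, since a single LUN is already allowed as many requests as the whole controller can handle.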