On 4/15/22 02:12, Yanjun Zhu wrote:
> On 2022/4/10 5:43, Bob Pearson wrote:
>> On 4/9/22 00:04, Christoph Hellwig wrote:
>>> On Fri, Apr 08, 2022 at 04:25:12PM -0700, Bart Van Assche wrote:
>>>> One of the functions in the above call stack is sd_remove(). sd_remove()
>>>> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
>>>> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
>>>> following comment: "Fail any new I/O". Do you agree that failing new I/O
>>>> before sd_shutdown() is called is wrong? Is there any other way to fix this
>>>> than moving the blk_queue_start_drain() etc. calls out of del_gendisk() and
>>>> into a new function?
>>>
>>> That SYNCHRONIZE CACHE is a passthrough command sent on the request_queue
>>> and should not be affected by stopping all file system I/O.
>>
>> When I run check -q srp
>> all the test cases pass but each one stops for 3+ minutes at synchronize cache.
>> The rxe device is still active until sync cache returns when the last QP and the PD
>> are destroyed. It may be that the queues are blocked waiting for something else
>> even though they have reported success??
>
> If you remove all the xarray patches and use the original source code, this will not occur.
>
> Zhu Yanjun
>

I missed one other point. The 3-minute delay is actually not an rxe bug at all; it was caused by a recent bad SCSI patch, which has since been reverted.