On 3/16/20 8:50 PM, Naohiro Aota wrote: > On Fri, Feb 28, 2020 at 04:12:42PM +0900, Naohiro Aota wrote: >> With zonemode=zbd and asynchronous ioengine, a thread takes a zone lock >> before an I/O submission (in zbd_adjust_block() or >> zbd_convert_to_open_zone()) and releases the lock after the I/O is put (in >> zbd_put_io()). With a small number of open zones and/or a large number of >> jobs, threads can easily end up circular lock dependency and deadlocks. For >> example, thread A sends an I/O to zone 0, so thread A holds a zone lock #0. >> Then, thread A continues on zone 1 and try to acquire zone lock #1. At the >> same time, thread B held zone lock #1, sent I/O to zone 1, and try to >> acquire zone #0. Now, both threads are waiting for each other's lock, which >> is never released. >> >> This series fixes three problems to eliminate the deadlock. First, taking >> all the zone locks should be avoided. zbd_process_swd() and >> zbd_reset_zones() take the lock for all zones of the specified device, >> preventing other threads from accessing different zones in parallel. While >> it is not the root cause of the deadlock, such all zone locking easily >> trigger a deadlock. So, this series reduces such contentions by (1) >> eliminating unnecessary invocation of zbd_process_swd() and (2) changing to >> take single zone at at time in zbd_reset_zones(). >> >> Secondly, zbd's I/O issuing path should expect lock contention with other >> threads and handle the case properly. Commit 6f0c608564c3 ("zbd: Avoid >> async I/O multi-job workload deadlock") also addressed this issue by using >> pthread_mutex_try_lock() and io_u_quiesce(). However, there are more >> pthread_mutex_lock() left to be fixed in the same way. >> >> Finally, fio should clean up I/Os properly on an error case. Currently, >> cleanup_pending_aio() and io_u_quiesce() fail to clean up I/Os with an >> error. As a result, zone locks, which are held by an erroneous thread, are >> kept held and blocks other threads to acquire the locks. >> >> This series also add a test case to cause the deadlock with unpatched fio. >> >> Patches 1 and 2 avoid long range lock holding to reduce lock contentions. >> >> Patch 3 introduces zone_lock and use it to handle the lock contention case. >> >> Patches 4 and 5 fix error path to clean up all the pending I/Os left. >> >> Patch 6 adds the test. >> >> Naohiro Aota (6): >> zbd: avoid initializing swd when unnecessary >> zbd: reset one zone at a time >> zbd: use zone_lock to lock a zone >> backend: always clean up pending aios >> io_u: ensure io_u_quiesce() to process all the IOs >> zbd: add test for stressing zone locking >> >> backend.c | 5 --- >> io_u.c | 6 +-- >> t/zbd/test-zbd-support | 30 +++++++++++++++ >> zbd.c | 84 ++++++++++++++++-------------------------- >> 4 files changed, 65 insertions(+), 60 deletions(-) >> >> -- >> 2.25.1 > > Ping on this series (also on this > https://www.spinics.net/lists/fio/msg08322.html ). Damien, can you take a look please? -- Jens Axboe