I'm trying to fix a couple of bugs related to offloading that I described in a previous post, and while digging into workload.c I've hit a few things that confuse me. If someone who understands the code well could give me some pointers, I'd be very grateful.

My first question: is the read of sw->flags in __get_submit_worker intentionally not protected by sw->lock? I'm guessing it is, on the theory that the worst that can happen is that a submit worker is seen as busy when it is actually idle. But this approach confuses me for a couple of reasons.

First, td->io_u_freelist seems to have the same sort of thread contention issue when offloading. The main td thread removes entries from the freelist while any worker thread can insert into it, so they all coordinate through td->io_u_lock. Why not use the same approach for maintaining the list of free submit workers?

Second, the number of io_u's allocated is exactly iodepth, which is also the number of submit workers the workqueue allocates when offloading. Unless I'm mistaken (which is entirely possible), a free io_u is obtained by get_io_u and handed off to a submit worker; when the I/O is done, the io_u becomes free again and the submit worker becomes idle. So shouldn't there always be an idle submit worker for any io_u that needs one?

Thanks in advance for your help.

- Nick
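
P.S. To make the first question concrete, here is a rough sketch of what I mean by handling idle submit workers the same way the io_u freelist is handled: one lock guarding a list that the submitting thread pops from and the worker threads push back onto. Every name in it (struct worker, struct worker_pool, pool_init, pool_get_idle, pool_put_idle) is made up for illustration. It is not the existing code and only shows the locking pattern, assuming plain pthread mutexes.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical types for illustration only, not the real structures. */
struct worker {
    struct worker *next_free;   /* link in the idle list */
    int index;                  /* position in the backing array */
};

struct worker_pool {
    pthread_mutex_t lock;       /* guards free_head, much as td->io_u_lock guards the io_u freelist */
    struct worker *free_head;   /* singly linked list of idle workers */
};

/* Put all n workers on the idle list; w[0] ends up at the head. */
static void pool_init(struct worker_pool *p, struct worker *w, int n)
{
    pthread_mutex_init(&p->lock, NULL);
    p->free_head = NULL;
    for (int i = n - 1; i >= 0; i--) {
        w[i].index = i;
        w[i].next_free = p->free_head;
        p->free_head = &w[i];
    }
}

/* Submitting thread: take an idle worker, or NULL if none is idle. */
static struct worker *pool_get_idle(struct worker_pool *p)
{
    struct worker *w;

    pthread_mutex_lock(&p->lock);
    w = p->free_head;
    if (w)
        p->free_head = w->next_free;
    pthread_mutex_unlock(&p->lock);
    return w;
}

/* Worker thread: mark itself idle again once its work item is done. */
static void pool_put_idle(struct worker_pool *p, struct worker *w)
{
    pthread_mutex_lock(&p->lock);
    w->next_free = p->free_head;
    p->free_head = w;
    pthread_mutex_unlock(&p->lock);
}
```

With something like this, the submitting side never misjudges a worker as busy, at the cost of taking one lock per lookup. Whether that cost matters in practice is exactly the part I can't judge.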