On 9/11/2023 7:24 AM, Dan Carpenter wrote:
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. Hello Brett Creeley, The patch bb500dbe2ac6: "vfio/pds: Add VFIO live migration support" from Aug 7, 2023 (linux-next), leads to the following Smatch static checker warning: drivers/vfio/pci/pds/lm.c:117 pds_vfio_put_save_file() warn: sleeping in atomic context The call tree is: pds_vfio_state_mutex_unlock() <- disables preempt -> pds_vfio_put_save_file() <- sleeps drivers/vfio/pci/pds/vfio_dev.c 29 void pds_vfio_state_mutex_unlock(struct pds_vfio_pci_device *pds_vfio) 30 { 31 again: 32 spin_lock(&pds_vfio->reset_lock); ^^^^^^^^^ Preempt disabled 33 if (pds_vfio->deferred_reset) { 34 pds_vfio->deferred_reset = false; 35 if (pds_vfio->state == VFIO_DEVICE_STATE_ERROR) { 36 pds_vfio_put_restore_file(pds_vfio); 37 pds_vfio_put_save_file(pds_vfio); ^^^^^^^^^^^^^^^^^^^^^^ 38 pds_vfio_dirty_disable(pds_vfio, false); 39 } 40 pds_vfio->state = pds_vfio->deferred_reset_state; 41 pds_vfio->deferred_reset_state = VFIO_DEVICE_STATE_RUNNING; 42 spin_unlock(&pds_vfio->reset_lock); 43 goto again; 44 } 45 mutex_unlock(&pds_vfio->state_mutex); 46 spin_unlock(&pds_vfio->reset_lock); Unrelated but it really makes me itch that we drop the mutex before the spinlock.
This was done based on Mellanox's implementation and there's some history/notes on where this came from.
AFAIK these are the relevant pointers for the original discussion and a code comment as well:
Original thread: https://lore.kernel.org/netdev/20211019105838.227569-8-yishaih@xxxxxxxxxx/T/
Also, there is a comment in drivers/vfio/pci/mlx5/main.c inside of mlx5vf_pci_aer_reset_done().
Maybe it would be best to add a comment in pds_vfio_reset() as well? Thanks, Brett
47 } drivers/vfio/pci/pds/lm.c 112 void pds_vfio_put_save_file(struct pds_vfio_pci_device *pds_vfio) 113 { 114 if (!pds_vfio->save_file) 115 return; 116 --> 117 pds_vfio_put_lm_file(pds_vfio->save_file); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Calls mutex_lock(). 118 pds_vfio->save_file = NULL; 119 } regards, dan carpenter
Thanks for catching this. It looks like this is due to a misplaced spin_unlock in pds_vfio_state_mutex_unlock(). I need to make sure to call spin_unlock() before calling the functions that are allowed to sleep. This should be fine as these functions are already protected by the state mutex whenever they can be called.
Thanks, Brett