вт, 26 мар. 2019 г. в 09:31, Aurelien Aptel <aaptel@xxxxxxxx>: > > In the oplock break handler, writing pending changes from pages puts > the FileInfo handle. If the refcount reaches zero it closes the handle > and waits for any oplock break handler to return, thus causing a deadlock. > > To prevent it we keep an additionnal reference of the SMB FileInfo > handle while we write/read pages so that when writepages puts the > handle, it won't close it. > > This was triggered by xfstest 464. > > Signed-off-by: Aurelien Aptel <aaptel@xxxxxxxx> > --- > fs/cifs/file.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > index 89006e044973..a9a515ea2e14 100644 > --- a/fs/cifs/file.c > +++ b/fs/cifs/file.c > @@ -4579,12 +4579,15 @@ void cifs_oplock_break(struct work_struct *work) > break_lease(inode, O_RDONLY); > else > break_lease(inode, O_WRONLY); > + > + cifsFileInfo_get(cfile); > rc = filemap_fdatawrite(inode->i_mapping); > if (!CIFS_CACHE_READ(cinode)) { > rc = filemap_fdatawait(inode->i_mapping); > mapping_set_error(inode->i_mapping, rc); > cifs_zap_mapping(inode); > } > + cifsFileInfo_put(cfile); > cifs_dbg(FYI, "Oplock flush inode %p rc %d\n", inode, rc); > } > > -- > 2.16.4 > You mentioned correctly in the other thread that this still doesn't prevent the deadlock and we need to detect in cifsFileInfo_put() that it is being called from cifs_oplock_break(). If there is a thread waiting on: oplock_break_cancelled = cancel_work_sync(&cifs_file->oplock_break); then this handle can't be referenced again by any other threads because it was removed from the open file lists. So, the extra reference shouldn't be acquired in oplock break handler routine but instead be taken before we submit the oplock break handler to the queue and while we are holding tcon->open_file_lock (see smb2_tcon_has_lease). Now, it come to the place where to put this reference. We need to make sure that the file is not used after we put the reference, so, I would suggest to try putting it at the very end of cifs_oplock_break() together with the extra flag (like you mentioned) indicating that we shouldn't call cancel_work_sync insde. -- Best regards, Pavel Shilovsky