Re: [PATCH v2] nfs: fix dio deadlock when O_DIRECT flag is flipped

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Trond,

On Mon, Jan 19, 2015 at 9:17 PM, Trond Myklebust
<trond.myklebust@xxxxxxxxxxxxxxx> wrote:
> Hi Tao,
>
> On Mon, Jan 19, 2015 at 4:15 AM, Peng Tao <tao.peng@xxxxxxxxxxxxxxx> wrote:
>> Running xfstest generic/036, we hit VM_BUG_ON() in nfs_direct_IO().
>> 036 toggles O_DIRECT flag while IO is going on. We cannot simply remove
>> the VM_BUG_ON() there because we'll have a deadlock in the code path.
>> inode->i_mutex is taken when calling into ->direct_IO.
>> And nfs_file_direct_write() would grab inode->i_mutex again.
>>
>> nfs_file_write() and generic_file_write_iter() checks for O_DIRECT twice,
>> and it creates a race window if user space is playing with O_DIRECT flag.
>> Fix it by calling generic_perform_write() instead, so that nfs_direct_IO()
>> is only invoked in swap on nfs case.
>>
>> Suggested-by: Christoph Hellwig <hch@xxxxxxxxxxxxx>
>> Signed-off-by: Peng Tao <tao.peng@xxxxxxxxxxxxxxx>
>> ---
>>  fs/nfs/file.c | 41 ++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 40 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/nfs/file.c b/fs/nfs/file.c
>> index 2ab6f00..e98604a 100644
>> --- a/fs/nfs/file.c
>> +++ b/fs/nfs/file.c
>> @@ -662,6 +662,45 @@ static int nfs_need_sync_write(struct file *filp, struct inode *inode)
>>         return 0;
>>  }
>>
>> +static ssize_t nfs_file_buffer_write(struct kiocb *iocb, struct iov_iter *from)
>> +{
>> +       struct file *file = iocb->ki_filp;
>> +       struct address_space *mapping = file->f_mapping;
>> +       struct inode *inode = mapping->host;
>> +       ssize_t result = 0;
>> +       size_t count = iov_iter_count(from);
>> +       loff_t pos = iocb->ki_pos;
>> +       int ret;
>> +
>> +       mutex_lock(&inode->i_mutex);
>> +       /* We can write back this queue in page reclaim */
>> +       current->backing_dev_info = mapping->backing_dev_info;
>> +       ret = generic_write_checks(file, &pos, &count, 0);
>> +       if (ret)
>> +               goto out;
>> +
>> +       if (!count)
>> +               goto out;
>> +
>> +       iov_iter_truncate(from, count);
>> +       ret = file_remove_suid(file);
>> +       if (ret)
>> +               goto out;
>> +
>> +       ret = file_update_time(file);
>> +       if (ret)
>> +               goto out;
>> +
>> +       result = generic_perform_write(file, from, pos);
>> +       if (likely(result >= 0))
>> +               iocb->ki_pos = pos + result;
>> +
>> +out:
>> +       current->backing_dev_info = NULL;
>> +       mutex_unlock(&inode->i_mutex);
>> +       return result ? result : ret;
>> +}
>> +
>>  ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
>>  {
>>         struct file *file = iocb->ki_filp;
>> @@ -697,7 +736,7 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
>>         if (!count)
>>                 goto out;
>>
>> -       result = generic_file_write_iter(iocb, from);
>> +       result = nfs_file_buffer_write(iocb, from);
>>         if (result > 0)
>>                 written = result;
>>
>
> I still haven't got an answer to my question as to why we need to do
> all this if we don't actually want anything other than the swapfile
> code to use nfs_direct_IO(}?
Sorry that I was under the impression that you were asking advice from
Al and thus skipped your proposal as Al did not response.

> The xfstest with toggling O_DIRECT at
> random doesn't reflect any sane behaviour that we want to support; all
> we want to do is ensure that it doesn't deadlock. Why wouldn't the
> proposal that we add a line to return '0' if !IS_SWAPFILE(inode)
> suffice?
>
The reason for ->direct_IO being called here is because we check
O_DIRECT flag twice in the code path and it creates a window for user
space application to flip O_DIRECT flag value. The patch fixes that
and it matches other filesystem's implementation, which is to check
O_DIRECT only once. I agree with you that we can return 0 in
nfs_direct_IO for non-swapfile case if we want to simply declare not
to support such O_DIRECT toggling behaviour. Given that the generic
layer falls back to buffer IO if ->direct_IO returns 0, it will have
same effect as this patch.

Cheers,
Tao
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux