Re: [PATCH] ceph: only dirty ITER_IOVEC pages for direct read

> On Apr 2, 2018, at 00:33, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> 
> On Fri, Mar 30, 2018 at 9:14 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>> On Fri, 2018-03-30 at 11:24 +0100, Luis Henriques wrote:
>>> (CC'ing Jeff)
>>> 
>>> Ilya Dryomov <idryomov@xxxxxxxxx> writes:
>>> 
>>>> On Fri, Mar 30, 2018 at 9:26 AM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>>>> 
>>>>> 
>>>>>> On 30 Mar 2018, at 15:20, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>>>>>> 
>>>>>> On Fri, Mar 30, 2018 at 5:21 AM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>> On 30 Mar 2018, at 00:49, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>> On Fri, Mar 16, 2018 at 4:32 AM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>> Hi Zheng,
>>>>>>>> 
>>>>>>>> Can you explain how this fixes the invalid memory access reported in
>>>>>>>> "KASAN: use-after-free Read in set_page_dirty_lock"?  Did you come up
>>>>>>>> with a different reproducer?
>>>>>>>> 
>>>>>>> 
>>>>>>> I don't know why KASAN reports a use-after-free (it looks like a false
>>>>>>> alarm) in this case. I compared the code with other filesystems and found
>>>>>>> that recent commits make them dirty only ITER_IOVEC pages. The change
>>>>>>> should also make KASAN silent.
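
A minimal sketch of that pattern, with illustrative names (this is not the
exact patch, just the shape of the fix being discussed):

    /* Dirty pages on read completion only for user-backed iterators.
     * ITER_BVEC/ITER_KVEC pages belong to a kernel caller, which may
     * already hold the page lock (e.g. the loop driver), so dirtying
     * them with set_page_dirty_lock() is wrong.
     */
    static void put_read_pages(struct page **pages, int num_pages,
                               bool should_dirty)
    {
            int i;

            for (i = 0; i < num_pages; i++) {
                    if (should_dirty)
                            set_page_dirty_lock(pages[i]);
                    put_page(pages[i]);
            }
    }

    /* at the call site, for a direct read: */
    bool should_dirty = iter_is_iovec(iter);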
>>>>>> 
>>>>>> Did you manage to reproduce that KASAN report?
>>>>>> 
>>>>> 
>>>>> no
>>>>> 
>>>>>> If it is a false positive, the patch shouldn't have been marked for
>>>>>> stable.
>>>>>> 
>>>>> 
>>>>> I CCed stable because fuse did the same:
>>>>> 
>>>>> commit 8fba54aebbdf1f999738121922e74bf796ad60ee
>>>>> Author: Miklos Szeredi <mszeredi@xxxxxxxxxx>
>>>>> Date:   Wed Aug 24 18:17:04 2016 +0200
>>>>> 
>>>>>    fuse: direct-io: don't dirty ITER_BVEC pages
>>>>> 
>>>>>    When reading from a loop device backed by a fuse file it deadlocks on
>>>>>    lock_page().
>>>>> 
>>>>>    This is because the page is already locked by the read() operation done on
>>>>>    the loop device.  In this case we don't want to either lock the page or
>>>>>    dirty it.
>>>>> 
>>>>>    So do what fs/direct-io.c does: only dirty the page for ITER_IOVEC vectors.
>>>> 
>>>> OK, so it is a potential deadlock which has nothing to do with that
>>>> KASAN report.  I have rewritten the changelog to reflect that, please
>>>> take a look:
>>>> 
>>>>  https://github.com/ceph/ceph-client/commit/85784f9395987a422fa04263e7c0fb13da11eb5c
>>> 
>>> Just as a side note, it's still trivial to get a cephfs-related KASAN
>>> report by running a fuzzer (trinity in my case) against a mainline
>>> kernel 4.16.0-rc7 with this fix backported.
>>> 
>> 
>> I suspect trinity is clobbering the array.
>>> As Jeff mentioned in a different thread, splice syscall is broken on
>>> cephfs and the fix for it is still tagged as "DO NOT MERGE" in the
>>> ceph-client testing branch.  I think there's still a bug in Jeff's
>>> fix, as I occasionally see a crash, but I haven't finished debugging
>>> it yet.  Unfortunately, Jeff's fix is probably not appropriate for
>>> stable kernels (but maybe I'm wrong).
>>> 
>>> Cheers,
>> 
>> (cc'ing Al)
>> 
>> Yeah, I should have revisited this a while ago. So the deal is that I
>> posted this patch upstream around a year ago, but Al didn't really like
>> it:
>> 
>>    https://patchwork.kernel.org/patch/9541829/
>> 
>> He wanted to add some iov_iter_for_each_page infrastructure and base
>> the new function on that. I had thought he had something queued up
>> along those lines at one point, but I don't see anything in mainline so
>> far.
>> 
>> There is an iov_iter_for_each_range, but it doesn't handle userland
>> iovecs and doesn't seem to care about alignment. I think we'll need
>> both here.
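
For context, the helper Jeff mentions looks roughly like this in the
4.16-era lib/iov_iter.c; it kmaps each segment and hands it to the
callback as a kvec, which is why it cannot cover userland iovecs (the
callback name below is illustrative):

    int iov_iter_for_each_range(struct iov_iter *i, size_t bytes,
                                int (*f)(struct kvec *vec, void *context),
                                void *context);

    /* illustrative callback: sum up the segment lengths */
    static int count_seg(struct kvec *vec, void *context)
    {
            size_t *total = context;

            *total += vec->iov_len;
            return 0;
    }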
>> 
>> I'm not sure however that a per-page callback is really that helpful
>> for callers like ceph_direct_read_write. We'll just end up having to do
>> more work in the fs to batch up the pages before we add the op to the
>> req.
>> 
>> iov_iter_get_pages_alloc is really sort of what we want in this case,
>> and we want that function to stitch together longer arrays when it can.
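
For reference, the existing primitive being referred to (4.16-era
include/linux/uio.h): it pins at most maxsize bytes worth of pages from
the iterator into a freshly allocated array and returns, via *start, the
byte offset of the data within the first page. A hedged sketch of a call
site:

    ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
                                     struct page ***pages, size_t maxsize,
                                     size_t *start);

    /* illustrative usage */
    struct page **pages;
    size_t page_off;
    ssize_t bytes = iov_iter_get_pages_alloc(iter, &pages, count, &page_off);

    if (bytes < 0)
            return bytes;
    /* ... build the request from pages[], then kvfree(pages) ... */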
>> 
>> Al, do you have any thoughts on what we should do here?
> 
> I think this can be fixed entirely in ceph by walking away from page
> vectors (barely tested):
> 
>  https://github.com/ceph/ceph-client/commits/wip-splice

The patch looks good to me. Using bio_vecs makes the code clearer.

Regards
Yan, Zheng
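
For reference, the bio_vec layout from include/linux/bvec.h: each segment
carries its own page, length, and offset, so a misaligned user buffer can
be described exactly, which a bare struct page ** cannot do:

    struct bio_vec {
            struct page     *bv_page;
            unsigned int    bv_len;
            unsigned int    bv_offset;
    };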


> 
> If folks would like to see something like iov_iter_get_bvecs{,_alloc}()
> in iov_iter.c, I can work on generalizing my helper (i.e. add maxbvecs
> and maxsize parameters -- I didn't need maxbvecs and maxsize is just
> "make a copy of the iterator and truncate it").  Otherwise we will
> probably merge it into ceph, as this code has remained broken for way
> too long...
> 
> Attached is the main patch from wip-splice.
> 
> Thanks,
> 
>                Ilya
> <0003-ceph-fix-iov_iter-issues-in-ceph_direct_read_write.patch>
