Re: [PATCH v3 0/3] copy_file_range in cephfs kernel client

On Fri, Sep 7, 2018 at 2:11 AM, Luis Henriques <lhenriques@xxxxxxxx> wrote:
> Gregory Farnum <gfarnum@xxxxxxxxxx> writes:
>
>> I don't have much useful to say here (unless Zheng wants me to look
>> carefully at the use of copy-get), but I'm excited to see this getting
>> done! :)
>>
>> One thing I will note is that it might be a good idea if either the
>> system admin or the Ceph cluster admin could disable this
>> behavior, just in case bugs turn up with the copy-from op. I don't
>> expect any as this is a pretty friendly use-case for it (protected by
>> FS caps, hurray!) but it is the first non-cache-tiering user that I'm
>> aware of that will turn up in the wild.
>
> If I understand you correctly, you're talking about adding a knob so
> that the OSDs could simply return an error when requested to perform a
> 'copy-from' op, right?  That shouldn't be too difficult, I believe.

Oh heavens no. I mean a switch on the client side to not invoke the
copy-from op at all. Probably that would just mean doing a full read
of the source and write of the destination, but perhaps you could also
just unhook the copy_file_range implementation from the VFS?
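
To illustrate the shape of that fallback, here is a minimal user-space sketch, not kernel code and not part of the patch series. The `use_offload` flag stands in for the hypothetical client-side switch, and the helper names `copy_range()` and `copy_read_write()` are invented for this example. With the switch off, the copy is a plain read of the source and write of the destination:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Plain read/write loop: a full read of the source and write of the
 * destination, using the current file offsets of both descriptors. */
static ssize_t copy_read_write(int src, int dst, size_t len)
{
    char buf[65536];
    size_t done = 0;

    while (done < len) {
        size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
        ssize_t n = read(src, buf, want);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break; /* source EOF */

        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(dst, buf + off, (size_t)(n - off));

            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* use_offload is the hypothetical knob: when false, copy_file_range()
 * is never invoked and only the read/write path is used. */
static ssize_t copy_range(int src, int dst, size_t len, bool use_offload)
{
    size_t done = 0;

    while (use_offload && done < len) {
        ssize_t n = copy_file_range(src, NULL, dst, NULL, len - done, 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            /* Offload unavailable: finish with the manual copy. */
            if (errno == ENOSYS || errno == EXDEV ||
                errno == EINVAL || errno == EOPNOTSUPP)
                break;
            return -1;
        }
        if (n == 0)
            return (ssize_t)done; /* source EOF */
        done += (size_t)n;
    }

    if (done < len) {
        ssize_t r = copy_read_write(src, dst, len - done);

        if (r < 0)
            return -1;
        done += (size_t)r;
    }
    return (ssize_t)done;
}
```

Calling `copy_range(src, dst, len, false)` never touches the offloaded path, which is the behaviour I'd want an admin to be able to select.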

I would assume this is pretty trivial but if it's complicated then
don't worry about it — I bring it up mostly because Josh thought there
were some outstanding issues with this RADOS op but on deeper
inspection he was wrong and conflating some other things. :)
-Greg

>
> Returning an error in PrimaryLogPG::do_osd_ops if that knob is set
> should do the job, but I really don't feel comfortable touching the OSDs
> code.  If you really think this is something useful, I can try to
> look into that, but it will take me a while :)
>
> Cheers,
> --
> Luis
>
>> -Greg
>>
>> On Thu, Sep 6, 2018 at 9:06 AM, Luis Henriques <lhenriques@xxxxxxxx> wrote:
>>> Changes since v2:
>>>
>>> - File size checks are now done after we have all the required caps
>>>
>>> Here are the main changes since v1, after Zheng's review:
>>>
>>> 1. ceph_osdc_copy_from() now receives source and destination snapids
>>>    instead of ceph_vino structs
>>>
>>> 2. Also get FILE_RD capabilities in ceph_copy_file_range() for source
>>>    file as other clients may have dirty data in their cache.
>>>
>>> 3. Fall back to the VFS copy_file_range default implementation if
>>>    we're copying beyond source file EOF
>>>
>>> Note that 2. required an extra patch modifying ceph_try_get_caps() so
>>> that it could perform a non-blocking attempt at getting CEPH_CAP_FILE_RD
>>> capabilities.
>>>
>>> And here's the original (v1) RFC cover letter just for reference:
>>>
>>> This series is my initial attempt at getting a copy_file_range syscall
>>> implementation in the kernel cephfs client using the 'copy-from' RADOS
>>> operation.
>>>
>>> The idea of getting this implemented was from Greg -- or, at least, he
>>> created a feature in the tracker [1].  I just decided to give it a try
>>> as the feature wasn't assigned to anyone ;-)
>>>
>>> I've had this patchset sitting on my laptop for a while already, waiting
>>> for me to revisit it, review some of its TODOs... but I finally decided
>>> to send it out as-is instead, to get some early feedback.
>>>
>>> The first patch implements the copy-from operation in the libceph
>>> module.  Unfortunately, the documentation for this operation is
>>> nonexistent and I had to do a lot of digging to figure out the details
>>> (and I probably missed something!).  For example, initially I was
>>> hoping that this operation could be used to copy more than one object at
>>> a time.  Doing an OSD request per object copy is not ideal, but
>>> unfortunately it seems to be the only way.  Anyway, my expectation is
>>> that this new operation will be useful for other features in the future.
>>>
>>> The 2nd patch is where the copy_file_range is implemented and could
>>> probably be optimised, but I didn't bother with that for now.  The
>>> important bit is that we still may need to do some manual copies if the
>>> offsets aren't object aligned or if the length is smaller than the
>>> object size.  I'm using do_splice_direct() for the manual copies as it
>>> was the easiest way to get a PoC running, but maybe there are better
>>> ways.
>>>
>>> I've done some functional testing on this PoC.  It also passes the
>>> generic xfstests suite, in particular the copy_file_range specific tests
>>> (430-434).  But I haven't done any benchmarks to measure any performance
>>> changes from using this syscall.
>>>
>>> Any feedback is welcome, especially regarding the TODOs in the code.
>>>
>>> [1] https://tracker.ceph.com/issues/21944
>>>
>>>
>>> Luis Henriques (3):
>>>   ceph: add non-blocking parameter to ceph_try_get_caps()
>>>   ceph: support the RADOS copy-from operation
>>>   ceph: support copy_file_range file operation
>>>
>>>  fs/ceph/addr.c                  |   2 +-
>>>  fs/ceph/caps.c                  |   7 +-
>>>  fs/ceph/file.c                  | 221 ++++++++++++++++++++++++++++++++
>>>  fs/ceph/super.h                 |   2 +-
>>>  include/linux/ceph/osd_client.h |  17 +++
>>>  include/linux/ceph/rados.h      |  19 +++
>>>  net/ceph/osd_client.c           |  72 +++++++++++
>>>  7 files changed, 335 insertions(+), 5 deletions(-)
>>>
>>
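
As a rough illustration of the alignment point in the quoted cover letter (manual copies are still needed when offsets aren't object aligned or the length is smaller than the object size), the arithmetic could be sketched as below. This is not the code from the patch. The `copy_plan` struct and `plan_copy()` helper are hypothetical names, and the sketch assumes the source and destination offsets share the same alignment within an object and that `object_size` is non-zero:

```c
#include <stdint.h>

/* Hypothetical breakdown of one copy request. */
struct copy_plan {
    uint64_t head_len;   /* bytes copied manually up to the first object boundary */
    uint64_t nr_objects; /* whole objects, one copy-from request each */
    uint64_t tail_len;   /* bytes copied manually after the last whole object */
};

/* Split [off, off + len) into an unaligned head, a run of whole
 * objects, and a tail shorter than one object. */
static struct copy_plan plan_copy(uint64_t off, uint64_t len,
                                  uint64_t object_size)
{
    struct copy_plan p = { 0, 0, 0 };
    uint64_t in_obj = off % object_size;

    if (in_obj != 0) {
        uint64_t to_boundary = object_size - in_obj;

        p.head_len = len < to_boundary ? len : to_boundary;
        len -= p.head_len;
    }
    p.nr_objects = len / object_size;
    p.tail_len = len % object_size;
    return p;
}
```

For example, with an assumed 4 MiB object size, a 10 MiB copy starting at offset 1 MiB gives a 3 MiB manual head, one whole object, and a 3 MiB manual tail, while any copy shorter than one object ends up entirely manual.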


