OSD 'copy-from' operation and truncate_seq value

Hi,

While working on implementing copy_file_range(2) for the kernel CephFS
client, I found an issue with truncated files that is described in [1].
The TL;DR is that, when executing a 'copy-from' OSD operation, both
truncate_seq and truncate_size are copied from the base object into the
target object. This, at least in the context of copy_file_range, doesn't
make sense and will cause problems if, for example, the target file had
previously been truncated, i.e. target.truncate_seq > base.truncate_seq
(see test case in [1]).
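
To make the failure mode concrete, here is a toy model of it.  This is
not OSD code: the Obj class, copy_from() and write() are hypothetical
names, and the rule in write() (an op carrying a newer truncate_seq
trims the object first) is a simplified assumption about how the OSD
treats truncate_{seq,size}.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    """Toy stand-in for a RADOS object (hypothetical, not Ceph code)."""
    data: bytes = b""
    truncate_seq: int = 0
    truncate_size: int = 0


def copy_from(src, dst):
    # Current behaviour as described above: the truncate metadata is
    # copied from the base object along with the data.
    dst.data = src.data
    dst.truncate_seq = src.truncate_seq
    dst.truncate_size = src.truncate_size


def write(obj, offset, buf, truncate_seq, truncate_size):
    # Simplified assumption: an op with a newer truncate_seq trims the
    # object to truncate_size before the write is applied.
    if truncate_seq > obj.truncate_seq:
        obj.data = obj.data[:truncate_size]
        obj.truncate_seq = truncate_seq
        obj.truncate_size = truncate_size
    end = offset + len(buf)
    data = obj.data.ljust(end, b"\0")
    obj.data = data[:offset] + buf + data[end:]


base = Obj(data=b"A" * 8)                               # never truncated
target = Obj(data=b"", truncate_seq=3, truncate_size=0)  # truncated before

copy_from(base, target)
seq_after_copy = target.truncate_seq                     # went backwards

# The client still holds truncate_seq=3 for the target file, so its
# next write looks "newer" and trims away the data just copied.
write(target, 8, b"BB", truncate_seq=3, truncate_size=0)
```

In this model the copied bytes are gone after the write, because the
copy moved the target's truncate_seq backwards.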

I've proposed a fix [2] but after discussing it with Gregory it sounds
more like a hack than a real solution.  Basically my patch simply adds a
new flag to the 'copy-from' operation which a client can use so that
truncate_{seq,size} aren't copied from the base object (and are *not*
changed with the copy operation).
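
A minimal sketch of what the flag does, again as a toy model rather
than the actual patch.  The flag name and value, the Obj class and
copy_from() are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical flag name and value, for illustration only.
COPY_FROM_KEEP_TRUNCATE = 0x1


@dataclass
class Obj:
    """Toy stand-in for a RADOS object (hypothetical, not Ceph code)."""
    data: bytes = b""
    truncate_seq: int = 0
    truncate_size: int = 0


def copy_from(src, dst, flags=0):
    dst.data = src.data
    # With the flag set, the target keeps its own truncate metadata.
    if not flags & COPY_FROM_KEEP_TRUNCATE:
        dst.truncate_seq = src.truncate_seq
        dst.truncate_size = src.truncate_size


base = Obj(data=b"A" * 8, truncate_seq=0, truncate_size=0)
target = Obj(data=b"", truncate_seq=3, truncate_size=0)

copy_from(base, target, flags=COPY_FROM_KEEP_TRUNCATE)
```

Here the target ends up with the base object's data but still has its
own truncate_seq of 3.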

Since my PR [2] is tagged 'pending-discussion', I decided to kick off
this discussion here on the mailing list, hoping to get the attention
of other people with a deeper understanding of the OSD internals.

Gregory's preferred solution would be to have the copy-from Op allow
truncate_seq and truncate_size to be set directly.  Unfortunately,
there seems to be no easy way of changing the interfaces to allow this,
as the ceph_osd_op union (in rados.h) doesn't seem to be able to
accommodate these 2 extra fields in copy_from.  So, my initial
questions would be:

- What would be the options for extending copy-from to include these 2
  extra fields?
- I believe I understand the usage of truncate_{seq,size}, but it's
  really not clear to me whether there are any scenarios where we *do*
  want to modify truncate_{seq,size} while doing a copy-from.  In the
  case of CephFS I don't think a copy-from will ever truncate a file,
  so the values could be left unchanged.  But would the obvious solution
  of simply *never* copying these fields be a valid solution?  (I
  suspect the answer is 'no' :-)
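
On the size constraint behind the first question, a rough back-of-the-
envelope check.  The field widths below are assumptions (packed
layout; an extent-like member taken to be the largest in the union)
and should be verified against rados.h.  The extended copy_from layout
is hypothetical.

```python
import struct

# Assumed packed layouts; "<" means little-endian with no padding.
EXTENT = "<QQQI"      # offset, length, truncate_size, truncate_seq
COPY_FROM = "<QQBI"   # snapid, src_version, flags, src_fadvise_flags
# Hypothetical extension: append truncate_seq (u32), truncate_size (u64).
COPY_FROM_EXT = COPY_FROM + "IQ"


def union_size(*members):
    # A union is as large as its largest member.
    return max(struct.calcsize(m) for m in members)


current = union_size(EXTENT, COPY_FROM)
extended = union_size(EXTENT, COPY_FROM_EXT)
growth = extended - current
```

Under these assumptions the extended copy_from member would become the
largest one, so the union (and with it the on-wire op) would grow.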

Another problem is that the client will also need to figure out if the
OSDs have this issue fixed so that it can decide whether to use the Op
or not.  My PR adds a new OSD_COPY_FILE_RANGE feature, overlapping
SERVER_NAUTILUS, but Greg's preference is to use the osdmap.  The kernel
client does have _some_ support for osdmap but I believe some extra bits
would be required to use this map in this case (although I would need to
look closer to figure out what and how to do that).
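
The client-side decision could look roughly like the sketch below,
whichever mechanism ends up providing the information.  The bit
position and both function names are hypothetical; the per-OSD masks
could come from connection features or from the osdmap.

```python
# Hypothetical bit position, for illustration only.
OSD_COPY_FILE_RANGE = 1 << 0


def all_osds_fixed(osd_feature_masks):
    # Offload is only safe if every OSD involved advertises the fix.
    masks = list(osd_feature_masks)
    return bool(masks) and all(m & OSD_COPY_FILE_RANGE for m in masks)


def choose_copy_path(osd_feature_masks):
    # Fall back to a plain read+write copy when any OSD lacks the fix.
    if all_osds_fixed(osd_feature_masks):
        return "copy-from"
    return "read-write"
```

For example, choose_copy_path([0b01, 0b11]) picks "copy-from", while
choose_copy_path([0b01, 0b10]) falls back to "read-write".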

Anyway, I'm looking for ideas on how to sort this out so that we can
have copy_file_range fixed in CephFS.

[1] https://tracker.ceph.com/issues/37378
[2] https://github.com/ceph/ceph/pull/25374

Cheers,
-- 
Luis


