On Thu, Oct 28, 2021 at 11:38 AM Yuval Lifshitz <ylifshit@xxxxxxxxxx> wrote:
>
> thanks for the info (and context)!
>
> the first set of usecases I would like to tackle are actually non-symmetric ones (unlike encryption and compression).
> for example:
> * a client is reading the content of a bucket that has images but wants to present low-resolution versions of the images (e.g. thumbnails), while the original uploader of the images is a different, unrelated client app
> * images may be watermarked based on which client fetches them
> * etc.
> it looks like most of the infrastructure is already there, and it should be straightforward to add a "lua filter" (deriving from RGWGetObj_Filter) to the list of filters.
>
> asymmetric "put" usecases should probably be handled out-of-band by using bucket notifications, where some external entity reads the object, does the processing and writes the modified object just like any other client.
>
> the more challenging ones would be symmetric usecases (like encryption and compression), where the user might want to use a different compression or encryption mechanism than the ones we are providing in our C++ code. for that, we would probably need to handle the "put" path inline as well.
>
> not sure if this is directly related to zipper as the RGWGetDataCB seems to be agnostic of the backend storage?

right. for RGWGetObj, all the data eventually streams through RGWGetObj_CB to be written out to the client via the frontend. the decompression and decryption filters are wrapping that.
so if we wanted to allow zipper stores the ability to transform the unencrypted/uncompressed stream, we'd want to give them a chance to wrap the stream before those other filters are added

in rgw::sal, this could look something like:

  virtual RGWGetDataCB* add_filter(RGWGetDataCB* cb) { return cb; }

which by default doesn't add any filters, but a store could override this to return some other filter that wraps 'cb'

>
>
>
> On Wed, Oct 27, 2021 at 8:08 PM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
>>
>> hi Yuval, i hope you don't mind a short history lesson here:
>>
>> in 2016, Adam Kupczyk wrote https://github.com/ceph/ceph/pull/11494
>> for inline object compression, which added the initial streaming
>> abstractions RGWPutObjDataProcessor, RGWPutObj_Filter, and
>> RGWGetObj_Filter. compression was implemented with the
>> RGWPutObj_Compress and RGWGetObj_Decompress filters in
>> src/rgw/rgw_compression.h
>>
>> in 2017, Adam added https://github.com/ceph/ceph/pull/11049 to support
>> server-side encryption with RGWGetObj_BlockDecrypt and
>> RGWPutObj_BlockEncrypt filters in src/rgw/rgw_crypt.h
>>
>> in 2018, i cleaned up the PutObj side in
>> https://github.com/ceph/ceph/pull/24453, and distilled
>> RGWPutObjDataProcessor down to a single pure virtual in class
>> rgw::putobj::DataProcessor in src/rgw/rgw_putobj.h
>>
>> last year, Dan finalized the zipper write interfaces in
>> https://github.com/ceph/ceph/pull/42550, which moved this
>> DataProcessor into namespace rgw::sal, and used it as a basis for the
>> rgw::sal::Writer interface in src/rgw/rgw_sal.h
>>
>> each zipper store implements this virtual function
>> rgw::sal::Writer::process_data(), which sees every buffer segment that
>> streams in from the client.
>> each store has the ability to add extra
>> filters, using something like rgw::putobj::Pipe to compose them into a
>> pipeline
>>
>> for object reads, there's a separate class RGWGetDataCB with a similar
>> virtual function handle_data(), which still provides the basis for the
>> RGWGetObj_Filter used by compression and encryption. this is passed
>> into zipper through rgw::sal::Object::ReadOp::iterate(), and each
>> store should be able to wrap that with other filters
>>
>> the interaction between zipper and inline compression/encryption
>> (which currently happen above zipper in rgw_op.cc) will need some more
>> thought. because if zipper stores modify an encrypted or compressed
>> stream, the filters above zipper won't be able to successfully
>> decrypt/decompress it
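
To make the read-path wrapping discussed in this thread concrete, here is a minimal, self-contained sketch of how a store-level add_filter() hook might compose with a handle_data() callback chain. Everything in it is a simplified stand-in and an assumption for illustration: GetDataCB, GetObjFilter, ClientSink, UppercaseFilter, Store, TransformingStore and stream_through are hypothetical names, chunks are std::string rather than bufferlist, and none of the signatures match the real RGWGetDataCB, RGWGetObj_Filter or rgw::sal classes.

```cpp
#include <cctype>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for RGWGetDataCB: receives each chunk of object data.
struct GetDataCB {
  virtual ~GetDataCB() = default;
  virtual int handle_data(const std::string& chunk) = 0;
};

// Simplified stand-in for RGWGetObj_Filter: wraps another callback and,
// by default, forwards each chunk to it unchanged.
struct GetObjFilter : GetDataCB {
  explicit GetObjFilter(GetDataCB* next) : next(next) {}
  int handle_data(const std::string& chunk) override {
    return next->handle_data(chunk);
  }
 protected:
  GetDataCB* next;  // not owned
};

// Final sink, standing in for the callback that writes to the client.
struct ClientSink : GetDataCB {
  std::string out;
  int handle_data(const std::string& chunk) override {
    out += chunk;
    return 0;
  }
};

// Hypothetical transforming filter (a placeholder for e.g. a thumbnail or
// watermark step): uppercases each chunk before passing it on.
struct UppercaseFilter : GetObjFilter {
  using GetObjFilter::GetObjFilter;
  int handle_data(const std::string& chunk) override {
    std::string copy = chunk;
    for (char& c : copy) {
      c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return next->handle_data(copy);
  }
};

// Stand-in for a zipper store. The default hook adds no filter.
struct Store {
  virtual ~Store() = default;
  virtual GetDataCB* add_filter(GetDataCB* cb) { return cb; }
};

// A store that overrides the hook to return a filter wrapping 'cb'.
struct TransformingStore : Store {
  GetDataCB* add_filter(GetDataCB* cb) override {
    filter = std::make_unique<UppercaseFilter>(cb);
    return filter.get();
  }
 private:
  std::unique_ptr<UppercaseFilter> filter;  // keeps the filter alive
};

// Streams chunks through whatever chain the store builds around the sink.
std::string stream_through(Store& store,
                           const std::vector<std::string>& chunks) {
  ClientSink sink;
  GetDataCB* cb = store.add_filter(&sink);
  for (const auto& chunk : chunks) {
    if (cb->handle_data(chunk) != 0) break;  // stop on error
  }
  return sink.out;
}
```

In this sketch the base Store passes data through untouched, while TransformingStore sees every chunk before it reaches the sink. In the real read path the decompression and decryption filters would also sit in the chain, so the order in which the store's filter is added decides whether it sees plaintext or the encrypted/compressed stream.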