On 1/21/20 8:11 PM, Pavel Begunkov wrote: > On 22/01/2020 04:55, Jens Axboe wrote: >> On 1/21/20 5:05 PM, Pavel Begunkov wrote: >>> It works well for basic cases, but there is still work to be done. E.g. >>> it misses @hash_reg_file checks for the second (output) file. Anyway, >>> there are some questions I want to discuss: >>> >>> - why sqe->len is __u32? Splice uses size_t, and I think it's better >>> to have something wider (e.g. u64) for fututre use. That's the story >>> behind added sqe->splice_len. >> >> IO operations in Linux generally are INT_MAX, so the u32 is plenty big. >> That's why I chose it. For this specifically, if you look at splice: >> >> if (unlikely(len > MAX_RW_COUNT)) >> len = MAX_RW_COUNT; >> >> so anything larger is truncated anyway. > > Yeah, I saw this one, but that was rather an argument for the future. > It's pretty easy to transfer more than 4GB with sg list, but that > would be the case for splice. I don't see this changing, ever, basically. And probably not a big deal, if you want to do more than 2GB worth of IO, you simply splice them over multiple commands. At those sizes, the overhead there is negligible. >>> - it requires 2 fds, and it's painful. Currently file managing is done >>> by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm >>> thinking to make each opcode function handle file grabbing/putting >>> themself with some helpers, as it's done in the patch for splice's >>> out-file. >>> 1. Opcode handler knows, whether it have/needs a file, and thus >>> doesn't need extra checks done in common path. >>> 2. It will be more consistent with splice. >>> Objections? Ideas? >> >> Sounds reasonable to me, but always easier to judge in patch form :-) >> >>> - do we need offset pointers with fallback to file->f_pos? Or is it >>> enough to have offset value. Jens, I remember you added the first >>> option somewhere, could you tell the reasoning? >> >> I recently added support for -1/cur position, which splice also uses. So >> you should be fine with that. >> > > I always have been thinking about it as a legacy from old days, and > one of the problems of posix. It's not hard to count it in the > userspace especially in C++ or high-level languages, and is just > another obstacle for having a performant API. So, I'd rather get rid > of it here. But is there any reasons against? It's not always trivial to do in libraries, or programming languages even. That's why it exists. I would not expect anyone to use it outside of that. -- Jens Axboe