On Tue, Dec 17, 2019 at 05:15:46PM -0700, Jens Axboe wrote: > On 12/17/19 1:26 PM, Linus Torvalds wrote: > > On Tue, Dec 17, 2019 at 11:39 AM Linus Torvalds > > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > >> > >> 'loff_t length' is not right. > > > > Looking around, it does seem to get used that way. Too much, though. > > > >>> + loff_t pos = data->pos; > >>> + loff_t length = pos + data->len; > >> > >> And WTH is that? "pos + data->len" is not "length", that's end. And this: > >> > >>> loff_t end = pos + length, done = 0; > >> > >> What? Now 'end' is 'pos+length', which is 'pos+pos+data->len'. > > > > But this is unrelated to the crazy types. That just can't bve right. > > Yeah, I fixed that one up, that was my error. > > >> Is there some reason for this horrible case of "let's allow 64-bit sizes?" > >> > >> Because even if there is, it shouldn't be "loff_t". That's an > >> _offset_. Not a length. > > > > We do seem to have a lot of these across filesystems. And a lot of > > confusion. Most of the IO reoutines clearly take or return a size_t > > (returning ssize_t) as the IO size. And then you have the > > zeroing/truncation stuff that tends to take loff_t. Which still smells > > wrong, and s64 would look like a better case, but whatever. > > > > The "iomap_zero_range() for truncate" case really does seem to need a > > 64-bit value, because people do the difference of two loff_t's for it. > > In fact, it almost looks like that function should take a "start , > > end" pair, which would make loff_t be the _right_ thing. Yeah. "loff_t length" always struck me as a little odd, but until now I hadn't heard enough complaining about it to put any effort into fixing the iomap_apply code that (afaict) mostly worked ok. But it shouldn't be a difficult change. > > Because "length" really is just (a positive) size_t normally. However, I don't think it's a good idea to reduce the @length argument to size_t (and the iomap_apply return value to ssize_t) because they're 32-bit values and doing that will force iomap to clamp lengths and return values to S32_MAX. Instituting a ~2G max on read and write calls is fine because those operate directly on file data (== slow), but the vfs already clamps the length before the iov gets to iomap. For the other iomap users that care more about the mappings and less about the data in those mappings (seek hole, seek data, fiemap, swap) it doesn't make much sense. If the filesystem can send back a 100GB extent map (e.g. holes in a sparse file, or we just have superstar allocation strategies), the fs should send that straight to the iomap actor function without having to cut that into 50x loop iterations. Looking ahead to things like file mapping leases, a (formerly wealthy) client should be able to request a mmap lease on 100GB worth of pmem and get the whole lease if the fs can allocate 100G at once. I like the idea of making the length parameter and the return value int64_t instead of loff_t. Is int64_t the preferred typedef or s64? I forget. > Honestly, I'd much rather leave the loff_t -> size_t/ssize_t to > Darrick/Dave, it's really outside the scope of this patch, and I'd > prefer not to have to muck with it. They probably feel the same way! Don't forget Christoph. Heh, we /did/ forget Christoph. :( Maybe they have better historical context since they invented this iomap mechanism for pnfs or something long before I came along. --D > -- > Jens Axboe >