Re: [PATCH 4/6] iomap: add struct iomap_ctx

On Tue, Dec 17, 2019 at 05:15:46PM -0700, Jens Axboe wrote:
> On 12/17/19 1:26 PM, Linus Torvalds wrote:
> > On Tue, Dec 17, 2019 at 11:39 AM Linus Torvalds
> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> 'loff_t length' is not right.
> > 
> > Looking around, it does seem to get used that way. Too much, though.
> > 
> >>> +       loff_t pos = data->pos;
> >>> +       loff_t length = pos + data->len;
> >>
> >> And WTH is that? "pos + data->len" is not "length", that's end. And this:
> >>
> >>>         loff_t end = pos + length, done = 0;
> >>
> >> What? Now 'end' is 'pos+length', which is 'pos+pos+data->len'.
> > 
> > But this is unrelated to the crazy types. That just can't be right.
> 
> Yeah, I fixed that one up, that was my error.
> 
> >> Is there some reason for this horrible case of "let's allow 64-bit sizes?"
> >>
> >> Because even if there is, it shouldn't be "loff_t". That's an
> >> _offset_. Not a length.
> > 
> > We do seem to have a lot of these across filesystems. And a lot of
> > confusion. Most of the IO routines clearly take or return a size_t
> > (returning ssize_t) as the IO size. And then you have the
> > zeroing/truncation stuff that tends to take loff_t. Which still smells
> > wrong, and s64 would look like a better case, but whatever.
> > 
> > The "iomap_zero_range() for truncate" case really does seem to need a
> > 64-bit value, because people do the difference of two loff_t's for it.
> > In fact, it almost looks like that function should take a "start,
> > end" pair, which would make loff_t be the _right_ thing.

Yeah.  "loff_t length" always struck me as a little odd, but until now I
hadn't heard enough complaining about it to put any effort into fixing
the iomap_apply code that (afaict) mostly worked ok.  But it shouldn't
be a difficult change.
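
For reference, the pos/length/end mix-up pointed out above boils down to
something like the userspace sketch below.  struct range_ctx and both
helpers are made-up names for illustration, not the actual iomap code,
and I'm using int64_t only so that it builds outside the kernel:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-call context; not the real struct. */
struct range_ctx {
	int64_t pos;	/* starting file offset */
	int64_t len;	/* number of bytes to operate on */
};

/*
 * Buggy pattern: "length" really holds the end offset, so adding pos
 * to it again yields pos + pos + len.
 */
static int64_t range_end_buggy(const struct range_ctx *ctx)
{
	int64_t pos = ctx->pos;
	int64_t length = pos + ctx->len;	/* misnamed: this is the end */

	return pos + length;
}

/* Intended computation: the end offset is simply pos + len. */
static int64_t range_end(const struct range_ctx *ctx)
{
	assert(ctx->len >= 0);
	return ctx->pos + ctx->len;
}
```

With pos = 4096 and len = 512, the second helper gives 4608 and the
buggy one gives 8704, i.e. it runs off by exactly pos.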

> > Because "length" really is just (a positive) size_t normally.

However, I don't think it's a good idea to reduce the @length argument
to size_t (and the iomap_apply return value to ssize_t) because they're
32-bit values on 32-bit architectures, and doing that would force iomap
to clamp lengths and return values to S32_MAX there.  A ~2G max on read
and write calls is fine because those operate directly on file data
(== slow), and the vfs already clamps the length before the iov gets to
iomap.

For the other iomap users that care more about the mappings and less
about the data in those mappings (seek hole, seek data, fiemap, swap) it
doesn't make much sense.  If the filesystem can send back a 100GB extent
map (e.g. holes in a sparse file, or we just have superstar allocation
strategies), the fs should send that straight to the iomap actor
function without having to cut that into 50x loop iterations.  Looking
ahead to things like file mapping leases, a (formerly wealthy) client
should be able to request a mmap lease on 100GB worth of pmem and get
the whole lease if the fs can allocate 100G at once.
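
To put a rough number on the "50x" above, here is a small userspace
sketch that counts how many passes it takes to walk one extent when each
call is capped at some maximum length.  passes_needed() is a made-up
helper for illustration only; it is not how iomap_apply is structured:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many capped calls are needed to cover extent_len bytes when
 * a single call can handle at most max_per_call bytes.
 */
static int64_t passes_needed(int64_t extent_len, int64_t max_per_call)
{
	int64_t passes = 0;

	assert(extent_len >= 0 && max_per_call > 0);
	while (extent_len > 0) {
		int64_t chunk = extent_len < max_per_call ?
				extent_len : max_per_call;

		extent_len -= chunk;
		passes++;
	}
	return passes;
}
```

A 100GiB extent with a 2GiB cap takes 50 passes; with a cap of S32_MAX
(one byte short of 2GiB) it takes 51; with a 64-bit length it takes one.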

I like the idea of making the length parameter and the return value
int64_t instead of loff_t.  Is int64_t the preferred typedef or s64?  I
forget.

> Honestly, I'd much rather leave the loff_t -> size_t/ssize_t to
> Darrick/Dave, it's really outside the scope of this patch, and I'd
> prefer not to have to muck with it. They probably feel the same way!

Don't forget Christoph.  Heh, we /did/ forget Christoph. :(
Maybe they have better historical context since they invented this iomap
mechanism for pnfs or something long before I came along.

--D

> -- 
> Jens Axboe
> 
