Re: [PATCH 4/4] ext3: Implement delayed allocation on page_mkwrite time

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue 03-05-11 19:09:48, Jan Kara wrote:
> On Mon 02-05-11 15:29:17, Andrew Morton wrote:
> > On Tue, 3 May 2011 00:20:20 +0200
> > Jan Kara <jack@xxxxxxx> wrote:
> > > On Mon 02-05-11 14:12:30, Andrew Morton wrote:
> > > > On Mon,  2 May 2011 22:56:56 +0200
> > > > Jan Kara <jack@xxxxxxx> wrote:
> > > > 
> > > > > So far, ext3 was allocating necessary blocks for mmapped writes when
> > > > > writepage() was called. There are several issues with this. The worst
> > > > > being that user is allowed to arbitrarily exceed disk quotas because
> > > > > writepage() is called from flusher thread context (which is root) and thus
> > > > > quota limits are ignored. Another bad consequence is that data is just lost
> > > > > if we find there's no space on the filesystem during ->writepage() time.
> > > > > 
> > > > > We solve these issues by implementing block reservation in page_mkwrite()
> > > > > callback. We don't want to really allocate blocks on page_mkwrite() time
> > > > > because for random writes via mmap (as seen for example with applications using
> > > > > BerkeleyDB) it results in much more fragmented files and thus much worse
> > > > > performance. So we allocate indirect blocks and reserve space for data block in
> > > > > page_mkwrite() and do the allocation of data block from writepage().
> > > > 
> > > > Yes, instantiating the metadata and accounting the data is a good
> > > > approach.  The file layout will be a bit suboptimal, but surely that
> > > > will be a minor thing.
> > > > 
> > > > But boy, it's a complicated patch!  Are we really sure that we want to
> > > > make changes this extensive to our antiquated old fs?  Or do we just
> > > > say "yeah, it's broken with quotas - use ext4"?
> > >   The patch isn't trivial, I agree (although it's mostly straightforward).
> > > Regarding telling users to switch to ext4 - it seems a bit harsh to me
> > > to ask people to switch to ext4 as a response to a (possibly security)
> > > issue they uncover. Because for most admins switching to ext4 will require
> > > some non-trivial testing I presume. Of course, the counterweight is the
> > > possibility of new bugs introduced to the code by my patch.
> > 
> > Yes.
> > 
> > > But after some
> > > considerations I've decided it's worth it and and fixed the bug...
> > 
> > Well.  How did you come to that decision?
>   So my thoughts were: If a company runs a hosting or similar service and
> some load either inadvertedly or even maliciously triggers this bug, your
> systems can be DOSed. That's bad and you need to fix that ASAP. From my
> experience with our SLE customers, they are willing to listen to advices
> such as fs choice when they plan a system deployment. After that they
> vehemently refuse any major change (and fs driver change is a major one).
> So I'm quite certain they'd rather accept this largish ext3 change.
> Finally, admittedly, I didn't think the patch will end up so large.
> 
> Looking into the patch, I could split off some cleanups and code
> reorganizations which are 20-30% of the patch but it probably does not make
> sense to split it more. What I think is a plus of the patch is that there
> are only two code paths that really change - ext3_get_blocks() has two new
> cases how it is called (to allocate only indirect blocks and to allocate
> already reserved data block) and trucate path which has to do more work to
> check whether indirect block can be removed.
>   
> > Are real users hurting from this problem?
>   I've got a report of this from NEC
> (http://ns3.spinics.net/lists/linux-ext4/msg20239.html) and OpenVZ people
> were also concerned
> (http://ns3.spinics.net/lists/linux-ext4/msg20288.html). I think there was
> one more report of this problem but I can't find it now. So yes, there are
> users who care.
> 
> >  What's the real-world case for fixing it?
>   Sorry, I don't understand the question (or how it is different from the
> previous one).
  Andrew, does this change your opinion in any way?

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux