Re: Ext4: Slow performance on first write after mount

Hi,

> One question regarding fallocate: I create a new file and do a 100MB
> fallocate with FALLOC_FL_KEEP_SIZE. Then I write only 70MB to that file
> and close it. Is the 30 MB unused preallocated space still preallocated
> for that file after closing it? Or does a close release the preallocated
> space?

I ran some tests and can now answer this myself ;-)
The space stays preallocated after the file is closed. Unmounting does not
release the space either. Interesting!
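
For reference, a minimal sketch of such a test. It assumes Linux with a 64-bit off_t and calls fallocate(2) through ctypes; the helper names fallocate_keep_size and size_and_allocation are made up for this illustration, and it is not necessarily the exact test I ran.

```python
import ctypes
import os

FALLOC_FL_KEEP_SIZE = 0x01  # value from <linux/falloc.h>


def fallocate_keep_size(fd, offset, length):
    """Call fallocate(2) with FALLOC_FL_KEEP_SIZE (assumes Linux, 64-bit off_t)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    if libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def size_and_allocation(path, prealloc, payload):
    """Preallocate, write less than that, close, and report what stat() sees."""
    with open(path, "wb") as f:
        fallocate_keep_size(f.fileno(), 0, prealloc)
        f.write(payload)
    st = os.stat(path)                      # taken after the file was closed
    return st.st_size, st.st_blocks * 512   # st_blocks counts 512-byte units
```

Calling size_and_allocation("testfile", 100 << 20, b"x" * (70 << 20)) on the mounted ext4 filesystem and comparing the two numbers repeats the check: the first value is the file size (70 MB), the second is the space actually allocated. If the filesystem does not support fallocate, the helper raises OSError.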

I also tested concurrent fallocate calls and writes on the same file
descriptor. It seems to work. Whether it is fast enough, I cannot say yet.
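
A rough sketch of that arrangement, again assuming Linux with a 64-bit off_t and fallocate(2) reached through ctypes. The names write_stream and CHUNK, the 8 MiB step and the half-chunk low-water mark are all assumptions chosen for illustration, not values from my test.

```python
import ctypes
import os
import threading

FALLOC_FL_KEEP_SIZE = 0x01   # value from <linux/falloc.h>
CHUNK = 8 << 20              # hypothetical preallocation step (8 MiB)


def fallocate_keep_size(fd, offset, length):
    """Call fallocate(2) with FALLOC_FL_KEEP_SIZE (assumes Linux, 64-bit off_t)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    if libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def write_stream(path, blocks, chunk=CHUNK):
    """Append blocks while a helper thread keeps preallocating ahead of the writer."""
    wake = threading.Event()
    done = threading.Event()
    state = {"allocated": 0}     # bytes preallocated so far (used as a hint only)

    with open(path, "wb") as f:
        fd = f.fileno()

        def preallocator():
            while True:
                wake.wait()
                wake.clear()
                if done.is_set():
                    return
                try:
                    fallocate_keep_size(fd, state["allocated"], chunk)
                except (OSError, AttributeError):
                    return       # unsupported here: writes go on without preallocation
                state["allocated"] += chunk

        helper = threading.Thread(target=preallocator)
        helper.start()
        written = 0
        try:
            wake.set()                       # request the first chunk
            for block in blocks:
                written += f.write(block)
                if state["allocated"] - written < chunk // 2:
                    wake.set()               # running low: ask for more
        finally:
            done.set()                       # set before waking so the helper exits
            wake.set()
            helper.join()
    return written, state["allocated"]
```

The writer never waits for the helper, so a slow fallocate only means that some data lands in space that was not preallocated. Whether that keeps write latency low enough is exactly the open question.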

Regards,
Frank

----- Original Message -----
From:    frankcmoeller@xxxxxxxx
To:      linux-ext4@xxxxxxxxxxxxxxx
Date:    19.05.2013 12:01
Subject: Re: Ext4: Slow performance on first write after mount

> Hi Andreas,
> 
> > Part of the problem is that filesystems are rarely unmounted cleanly,
> > so it means that this information would need to be updated
> > periodically to disk so that it is available after a crash.
> > I wouldn't object to some kind of "lazy" updating of group information
> > on disk that at least gives the newly-mounted filesystem a rough idea
> > of what each group's usage is. It wouldn't have to be totally accurate
> > (it wouldn't replace the bitmaps), but maybe 2 bits per group would be
> > enough as a starting point?
> > For a 32 TB filesystem that would be about 16 4kB blocks of bits that
> > would be updated periodically (e.g. every five minutes or so). Since
> > the allocator will typically work in successive groups that might not
> > cause too much churn.
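
The "about 16 4kB blocks" figure can be checked with a little arithmetic. The sketch below assumes the usual ext4 geometry of 4 KiB blocks and 32768 blocks (128 MiB) per group, which the mail does not state, and reads "32 TB" as 32 TiB; summary_blocks is a name invented here.

```python
BLOCK_SIZE = 4096          # assumed: 4 KiB filesystem blocks
BLOCKS_PER_GROUP = 32768   # assumed: ext4 default for 4 KiB blocks (128 MiB groups)
BITS_PER_GROUP = 2         # from the proposal quoted above


def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def summary_blocks(fs_bytes):
    """Return (number of groups, 4 KiB blocks needed for the per-group bits)."""
    groups = ceil_div(fs_bytes, BLOCK_SIZE * BLOCKS_PER_GROUP)
    summary_bytes = ceil_div(groups * BITS_PER_GROUP, 8)
    return groups, ceil_div(summary_bytes, BLOCK_SIZE)


print(summary_blocks(32 * 2**40))   # (262144, 16)
```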
> 
> Yes, you're right. The stored data wouldn't be 100% reliable. And yes, it
> would be really good if the filesystem knew more right after mount, so
> that it could find a good group more quickly.
> What do you think of this:
> 1. I have read this in some discussions already: you already store the
>   amount of free space for every group. Why not also store the size of
>   the largest contiguous free area in each group? Then you wouldn't have
>   to read the whole group.
> 2. What about a list (in memory and also stored on disk) of all unused
>   groups (1 bit per group)? If the allocator cannot find a good group
>   within, let's say, half a second, a group from this list is used.
>   The list would not be 100% reliable either (because of the unclean
>   unmounts mentioned above), so you would still need to search the list
>   for a good group. If no good group is found in the list, the allocator
>   can continue searching.
>   This doesn't help in all situations (e.g. an almost full disk, or every
>   group containing a small amount of data), but in many cases it should
>   be much faster, as long as the list is not totally outdated.
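
To make idea 1 concrete, here is a toy model, not ext4's actual allocator. It assumes a per-group hint of the largest contiguous free run that may be stale, plus an expensive scan callback standing in for reading the group's bitmap; find_group, hints and scan are all hypothetical names.

```python
def find_group(hints, scan, wanted):
    """Pick the first group that really has `wanted` contiguous free blocks.

    hints[i] is the (possibly stale) stored largest free run of group i;
    scan(i) returns the true value and stands in for reading the bitmap.
    Returns (group index or None, number of expensive scans performed).
    """
    scans = 0
    for idx, hint in enumerate(hints):
        if hint < wanted:
            continue             # skipped without touching the bitmap
        scans += 1
        if scan(idx) >= wanted:  # verify, because the hint may be outdated
            return idx, scans
    return None, scans           # caller would fall back to a full search


# Group 1 has a stale hint: it claims 2000 free blocks in a row but has none.
actual = [10, 0, 500, 2048, 4096]
hints = [10, 2000, 500, 2048, 4096]
print(find_group(hints, actual.__getitem__, 1024))   # (3, 2)
```

The 1-bit list from idea 2 is the same scheme with a coarser hint: the stored bit only says "probably empty", and it still has to be verified before use.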
> 
> > It would be possible to fallocate() at some expected size (e.g. average
> > file size) and then either truncate off the unused space, or fallocate()
> > some more in another thread when you are close to running out.
> > If the fallocate() is done in a separate thread the latency can be hidden
> > from the main application?
> Adding a new thread for fallocate shouldn't be a big problem. But fallocate
> might generate a lot of disk activity (while searching for a good group).
> I don't know whether parallel writing from the other thread would be fast
> enough.
> 
> One question regarding fallocate: I create a new file and do a 100MB
> fallocate with FALLOC_FL_KEEP_SIZE. Then I write only 70MB to that file
> and close it. Is the 30 MB unused preallocated space still preallocated
> for that file after closing it? Or does a close release the preallocated
> space?
> 
> Regards,
> Frank
> 
> > 
> > Cheers, Andreas 
> > 
> > > And you have to take care of alignment, and there are several threads
> > > on the internet which explain why you shouldn't use it (or only in very
> > > special situations, and I don't think that my situation is one of them).
> > > And ext4 group initialization also takes place when using O_DIRECT (as
> > > said before, perhaps I did something wrong).
> > > 
> > > Regards,
> > > Frank
> > > 
> > > ----- Original Message -----
> > > From:    "Sidorov, Andrei" <Andrei.Sidorov@xxxxxxxxxx>
> > > To:      "frankcmoeller@xxxxxxxx" <frankcmoeller@xxxxxxxx>,
> > >          ext4 development <linux-ext4@xxxxxxxxxxxxxxx>
> > > Date:    17.05.2013 23:18
> > > Subject: Re: Ext4: Slow performance on first write after mount
> > > 
> > >> Hi Frank,
> > >> 
> > >> Consider using bigalloc feature (requires reformat), preallocate space
> > >> with fallocate and use O_DIRECT for reads/writes. However, 188k writes
> > >> are too small for good throughput with O_DIRECT. You might also want
> > >> to adjust max_sectors_kb to something larger than 512k.
> > >> 
> > >> We're doing 6in+6out 20Mbps streams just fine.
> > >> 
> > >> Regards,
> > >> Andrei.
> > >> 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-ext4"
> > > in the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html



