Re: [RFC] Smart fibration plugin ext_4321

On 01/06/2017 05:34 PM, Dušan Čolić wrote:
On Fri, Jan 6, 2017 at 2:44 PM, Edward Shishkin
<edward.shishkin@xxxxxxxxx> wrote:
On 12/26/2016 11:13 PM, Dušan Čolić wrote:
On Mon, Dec 26, 2016 at 7:47 PM, Edward Shishkin
<edward.shishkin@xxxxxxxxx> wrote:


On 12/25/2016 02:59 AM, Dušan Čolić wrote:
Fibration is a great way to decrease fragmentation and increase
throughput.
Currently there are 4 fibration plugins: lex, dot_o, ext_1 and ext_3,
and they all have their upsides and downsides.

The proposed fibration plugin combines them all: it groups files that
share the same 1-, 2-, 3- or 4-character extension and places each
group in the same fiber.

With this fibration plugin, all files of one type (e.g. xvid) would be
in the same group within a directory on disk, sorted alphabetically


What application wants all xvid files to be in the same group?
Do you have any benchmark numbers which show advantages
of the new plugin?

Xvid files are just an example.
ext_1234 fibration would be equal to the sum of ext_1, ext_2, ext_3,
ext_4 and dot_o in one.
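
To make the idea concrete, here is a minimal sketch of how such a combined
plugin might classify a name. It is not taken from the patch: the function
names ext_len_1234 and fibre_ext_1234_sketch are hypothetical, and so is the
layout of the 7-bit value (extension length in the upper two bits, a 5-bit
sum of the extension characters below it). Distinct extensions can collide
under this scheme, and a name without an extension shares the value 0 with
some 1-character extensions.

```c
#include <stddef.h>

/* Hypothetical helper: length (1..4) of the extension of `name`,
 * or 0 if the name has no extension of 1 to 4 characters.
 * At least one character is required before the dot. */
int ext_len_1234(const char *name, size_t len)
{
    for (size_t n = 1; n <= 4; n++) {
        if (len > n + 1 && name[len - n - 1] == '.')
            return (int)n;
    }
    return 0;
}

/* Hypothetical 7-bit fiber value (0..127): longer extensions get
 * larger values, so they sort after shorter ones. */
unsigned fibre_ext_1234_sketch(const char *name, size_t len)
{
    size_t n = (size_t)ext_len_1234(name, len);
    unsigned sum = 0;

    if (n == 0)
        return 0;
    for (size_t i = 0; i < n; i++)
        sum += (unsigned char)name[len - n + i];
    /* upper 2 bits: extension length - 1; lower 5 bits: character sum */
    return 32u * (unsigned)(n - 1) + sum % 32u;
}
```

With this layout, main.c lands in a lower fiber than song.mp3, which in
turn lands below movie.xvid, so the 4-character group is among the last
groups in a directory.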

In the current default plugin (dot_o) we sort all files by name from the
start, except .o files, which we put at the end.
So if we had a source directory with .c, .h and .o files in it, the
files would be ordered by extension like: chchchchchchchchoooooooooooooo
I presumed that in some use cases it is better to have the files sorted
as: ccccccccccchhhhhhhhhhhhhhoooooooooooo
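
The two orderings can be reproduced with a small user-space sketch. It only
models the orderings as described here (name order with .o files last,
versus extension first and then name); it is not the actual reiser4 key
comparison, and all function names in it are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Extension = text after the last dot, or "" if there is none. */
const char *file_ext(const char *name)
{
    const char *dot = strrchr(name, '.');
    return (dot && dot != name) ? dot + 1 : "";
}

int is_dot_o(const char *name)
{
    return strcmp(file_ext(name), "o") == 0;
}

/* dot_o-like order: everything by name, but .o files go last. */
int cmp_dot_o_like(const void *a, const void *b)
{
    const char *x = *(const char *const *)a;
    const char *y = *(const char *const *)b;
    int ox = is_dot_o(x), oy = is_dot_o(y);

    return ox != oy ? ox - oy : strcmp(x, y);
}

/* Proposed order: group by extension first, then by name. */
int cmp_by_ext_then_name(const void *a, const void *b)
{
    const char *x = *(const char *const *)a;
    const char *y = *(const char *const *)b;
    int c = strcmp(file_ext(x), file_ext(y));

    return c ? c : strcmp(x, y);
}

/* Write the first extension character of each name into `out`
 * ('-' for names without an extension); `out` needs n + 1 bytes. */
void ext_pattern(const char **names, size_t n, char *out)
{
    for (size_t i = 0; i < n; i++) {
        const char *e = file_ext(names[i]);
        out[i] = *e ? *e : '-';
    }
    out[n] = '\0';
}
```

Sorting the six names a.c, a.h, a.o, b.c, b.h, b.o with qsort() and
cmp_dot_o_like yields the pattern chchoo, while cmp_by_ext_then_name
yields cchhoo.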

The hypothesis is that files with the same extension are of the same
order of size, and that this can be used to reduce fragmentation.


What kind of fragmentation are you talking about?
Internal (which results in "dead" disk space), or
external (which results in a lot of "extents")?

External

Edward.


If we group files of the same extension together within one directory,
then when we write new files of an extension after some files of that
extension have been deleted, the new files would be of the same order
of size as the deleted ones, so they would be written in a similar
place and occupy a 'hole' of similar size.


So "similar" means the same order, that is, file sizes can differ by a
factor of 2?
TBH, I don't see what can be deduced from this assumption ;)
It can happen that a new file either doesn't fit into that hole, or
occupies too little of it, so that the next file won't fit into the
rest of the hole.

Edward.


Of course, I am not talking about files of a few kB in size, where
Reiser4 is great at packing, but about files from a few MB to a few GB.

E.g. a directory with mp3 and xvid files: mp3s are on the order of MB
and xvids on the order of GB. If we sort them just by name, the order
of xvid and mp3 files within one directory would be random, so when
deleting the smaller ones we would make random holes (like going from
mxmxmxxmmmxxxxmxxmmmx to
mx xmxx  mx  xmx mmmx).
With grouped writing, where all mp3s are written first and all xvids
after them, some deletions would leave the smaller holes grouped first
and the larger ones last (like going from
mmmmmmmmmmmmxxxxxxxxxx to
mm m   mmm mmxx xxx xxx).
The main thing is that on subsequent writes we would put mp3s into mp3
holes and xvids into xvid holes (like going from
mm m   mmm mmxx xxx xxx to
mmMmMMMmmmXmmxxXxxx xxx),
and so reduce the fragmentation that we would create if we tried to
write xvids over mp3 holes.

One obvious use case where I hypothesize that this type of fibration
is better in the long term would be directories with content similar
to a typical Downloads directory: a lot of different types (and sizes)
of files that get written and deleted a lot.

ext_1234 fibration behaves the same as dot_o for directories that
contain only one file extension, or one extension plus .o files.

Of course, this is just a hypothesis that I would like to prove with
some fragmentation benchmarks, but I wanted to hear your thoughts first.

Also, while I was looking through the code I found a part that I could
comprehend, elegant and easy to understand, so I wanted to build
something with it to learn more.


Thanks,
Edward.

Thank you for your time and effort

Dushan


    so that we will avoid putting
small files between them and, in that way, reduce fragmentation. That
group (xvid, 4-character extensions) would be among the last groups in
a directory, so that all small files would be written before it.

The problem with the attached patch is that currently every fibre value
is defined as u64 (e.g. static __u64 fibre_ext_3), but if I understood
the comments in kassign.c and fibration.c correctly, the fibration part
of the key is only 7 bits long.
If that is true, how does fibre_ext_3 work?
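
One way a 64-bit return type and a 7-bit field can coexist is sketched
below. This is an assumption, not something confirmed in this thread: it
supposes that the fibre value is returned already shifted into the top 7
bits of a 64-bit key element (a shift of 57), so any wider intermediate
value, such as a sum of three characters, is truncated by the shift. All
names here (FIBRE_SHIFT, fibre_no, fibre_ext_3_sketch, fibre_bits) are
hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: the fibre occupies the top 7 bits of a 64-bit value. */
enum { FIBRE_SHIFT = 57 };

/* Shift a fibre number into the top 7 bits. Bits of `n` above the
 * low 7 fall off the top, so the result is effectively n mod 128. */
uint64_t fibre_no(uint64_t n)
{
    return n << FIBRE_SHIFT;
}

/* Sketch in the spirit of an ext_3-style plugin: sum the three
 * characters of a 3-character extension, otherwise use fibre 0. */
uint64_t fibre_ext_3_sketch(const char *name, size_t len)
{
    if (len > 4 && name[len - 4] == '.') {
        uint64_t sum = (uint64_t)(unsigned char)name[len - 3]
                     + (uint64_t)(unsigned char)name[len - 2]
                     + (uint64_t)(unsigned char)name[len - 1];
        return fibre_no(sum);
    }
    return fibre_no(0);
}

/* Recover the 7-bit fibre number from the shifted value. */
unsigned fibre_bits(uint64_t fibre)
{
    return (unsigned)(fibre >> FIBRE_SHIFT);
}
```

Under that assumption, the return type is u64 only because the value is
merged into a 64-bit key element; the low 57 bits of the returned value
are always zero, and only 128 distinct fibres exist.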

Thanks

Dushan

--
To unsubscribe from this list: send the line "unsubscribe reiserfs-devel"
in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

