Re: [RFC] Smart fibration plugin ext_4321

On Fri, Jan 6, 2017 at 2:44 PM, Edward Shishkin
<edward.shishkin@xxxxxxxxx> wrote:
> On 12/26/2016 11:13 PM, Dušan Čolić wrote:
>>
>> On Mon, Dec 26, 2016 at 7:47 PM, Edward Shishkin
>> <edward.shishkin@xxxxxxxxx> wrote:
>>>
>>>
>>>
>>> On 12/25/2016 02:59 AM, Dušan Čolić wrote:
>>>>
>>>> Fibration is a great way to decrease fragmentation and increase
>>>> throughput.
>>>> Currently there are 4 fibration plugins: lex, dot.o, ext_1 and ext_3,
>>>> and they all have their upsides and downsides.
>>>>
>>>> The proposed fibration plugin combines them all: it groups files by
>>>> their 1-, 2-, 3- or 4-character extension and places files with the
>>>> same extension in the same fiber group.
>>>>
>>>> With this fibration plugin, all files of one type (e.g. xvid) in a
>>>> folder would be in the same group on disk, sorted alphabetically
>>>
>>>
>>>
>>> What application wants all xvid files to be in the same group?
>>> Do you have any benchmark numbers which show advantages
>>> of the new plugin?
>>>
>> Xvid files are just an example.
>> ext_1234 fibration would be equivalent to ext_1, ext_2, ext_3, ext_4
>> and dot_o combined in one plugin.
>>
>> In the current default plugin (dot_o) we sort all files by name from
>> the start, except .o files, which we put at the end.
>> So if we had a source directory with .c, .h and .o files in it, the
>> files by extension would be ordered like: chchchchchchchchoooooooooooooo
>> I presumed that in some use cases it is better to have the files ordered
>> ccccccccccchhhhhhhhhhhhhhoooooooooooo
>>
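The two orderings above can be reproduced with a small sketch. This is only
a toy model and not Reiser4 code: dot_o_key and ext_group_key are
hypothetical names, and a real key has more components than a
(group, name) pair.

```python
import os

def dot_o_key(name):
    # Toy model of dot_o (assumption): ".o" files sort after everything
    # else, and files are ordered by name inside each of the two groups.
    is_dot_o = len(name) > 2 and name.endswith(".o")
    return (1 if is_dot_o else 0, name)

def ext_group_key(name):
    # Toy model of the proposed plugin (assumption): files with the same
    # 1- to 4-character extension share a group, ordered by name inside it.
    ext = os.path.splitext(name)[1][1:]
    group = ext if 1 <= len(ext) <= 4 else ""
    return (group, name)

def layout(names, key):
    # One letter per file (the last character of its name), in key order.
    return "".join(n[-1] for n in sorted(names, key=key))

names = [f"{stem}.{ext}" for stem in ("a", "b", "c") for ext in ("c", "h", "o")]
print(layout(names, dot_o_key))      # chchchooo
print(layout(names, ext_group_key))  # ccchhhooo
```

In this model the only difference between the two plugins is the first
element of the sort key.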
>> The hypothesis is that files with the same extension are of the same
>> order of magnitude in size, and that this can be used to reduce
>> fragmentation.
>
>
>
> What kind of fragmentation are you talking about?
> Internal (which results in "dead" disk space), or
> external (which results in a lot of "extents")?
>
External

> Edward.
>
>
>>
>> If we group files with the same extension together within one
>> directory, then when we write files of some extension after deleting
>> other files of that extension, the new files fall in the same group as
>> the deleted ones, so they would be written in a similar place and
>> occupy a 'hole' of similar size.
>> Of course, I am not talking about files of a few kB, where Reiser4 is
>> great at packing, but about files from a few MB to a few GB.
>>
>> E.g. a directory with mp3 and xvid files: mp3s are on the order of MB
>> and xvid files on the order of GB. If we sort them just by name, the
>> order of xvid and mp3 files in one directory would be random, so when
>> deleting files we would make random holes (like from
>> mxmxmxxmmmxxxxmxxmmmx to mx xmxx  mx  xmx mmmx).
>> With grouped writing, where all mp3s would be written first and all
>> xvid files after them, after some deletions we would have the smaller
>> holes grouped first and the larger ones last (like from
>> mmmmmmmmmmmmxxxxxxxxxx to mm m   mmm mmxx xxx xxx). The main thing is
>> that on later writes we would write mp3s into mp3 holes and xvid files
>> into xvid holes (like from mm m   mmm mmxx xxx xxx to
>> mmMmMMMmmmXmmxxXxxx xxx), and so reduce the fragmentation that we would
>> create if we tried to write xvid files over mp3 holes.
>>
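The hole-reuse argument above can be tried out on a toy block map. Everything
in this sketch is an assumption made for illustration: the sizes (mp3 = 1
block, xvid = 4 blocks), the first-fit allocator, and the rule that a new
file starts searching at the beginning of its own group. It does not model
Reiser4's real allocator.

```python
SIZES = {"m": 1, "x": 4}  # hypothetical file sizes in blocks

def build(order):
    # Lay files out back to back; each block records (file id, kind).
    disk = []
    for file_id, kind in enumerate(order):
        disk += [(file_id, kind)] * SIZES[kind]
    return disk

def group_starts(disk):
    # First block position occupied by each kind in the initial layout.
    starts = {}
    for pos, (_, kind) in enumerate(disk):
        starts.setdefault(kind, pos)
    return starts

def delete(disk, victims):
    # Free every block that belongs to one of the deleted file ids.
    return [None if b is not None and b[0] in victims else b for b in disk]

def write(disk, kind, start=0):
    # First-fit from `start`: fill free blocks in order and count how many
    # separate extents the new file ends up with. The toy does not append
    # past the end of the disk if free space runs out.
    need, extents, prev = SIZES[kind], 0, None
    for pos in range(start, len(disk)):
        if need == 0:
            break
        if disk[pos] is None:
            disk[pos] = ("new", kind)
            if prev != pos - 1:
                extents += 1
            prev = pos
            need -= 1
    return extents

def simulate(order, victims, new_files, grouped):
    disk = build(order)
    starts = group_starts(disk) if grouped else {}
    disk = delete(disk, victims)
    return sum(write(disk, kind, starts.get(kind, 0)) for kind in new_files)

# Delete two mp3s and one xvid, then write one xvid and two mp3s.
print(simulate("mxmxmxmx", {0, 2, 5}, "xmm", grouped=False))  # 5
print(simulate("mmmmxxxx", {0, 2, 5}, "xmm", grouped=True))   # 3
```

In the interleaved case the new xvid file is split across two mp3 holes and
part of an xvid hole (3 extents). In the grouped case it fits the xvid hole
in one piece. A single hand-picked scenario like this proves nothing, so
real fragmentation benchmarks are still needed.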
>> One obvious use case where I hypothesize that this type of fibration
>> is better in the long term is directories with content similar to a
>> typical Downloads directory: a lot of different types (and sizes) of
>> files that get written and deleted a lot.
>>
>> ext_1234 fibration behaves the same as dot_o for directories that
>> contain only one file extension, or one extension plus .o files.
>>
>> Of course, this is just a hypothesis that I would like to prove with
>> some fragmentation benchmarks, but I wanted to hear your thoughts.
>>
>> And while I was looking through the code I found a part that I could
>> comprehend, elegant and easy to understand, so I wanted to build
>> something with it to learn more.
>>
>>
>>> Thanks,
>>> Edward.
>>>
>> Thank you for your time and effort
>>
>> Dushan
>>
>>
>>>
>>>>    so that we avoid putting
>>>> small files between them and in that way reduce fragmentation. That
>>>> group (xvid, a 4-character extension) would be among the last groups
>>>> under one directory, so that all small files would be written before it.
>>>>
>>>> The problem with the attached patch is that currently every fibre value
>>>> is defined as u64 (e.g. static __u64 fibre_ext_3), but if I understood
>>>> the comments in kassign.c and fibration.c correctly, the fibration part
>>>> of the key is only 7 bits long.
>>>> If that is true, how did fibre_ext_3 work?
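One way a 64-bit return type and a 7-bit field can coexist is sketched below.
It is a model only, built on assumptions that are not taken from this thread:
that the fibre occupies the top 7 bits of a 64-bit key word, and that the
plugin reduces the extension to a small number by summing its bytes. The
names FIBRE_SHIFT, fibre_no and fibre_ext_3_model are hypothetical.

```python
U64_MASK = (1 << 64) - 1
FIBRE_BITS = 7
FIBRE_SHIFT = 64 - FIBRE_BITS  # assumption: fibre sits in the top 7 bits

def fibre_no(n):
    # Shift the value into the top bits of a 64-bit word. Anything above
    # the low 7 bits of n falls off the end, as it would in a C __u64.
    return (n << FIBRE_SHIFT) & U64_MASK

def fibre_ext_3_model(name):
    # Assumed reduction: sum the bytes of a 3-character extension.
    if len(name) > 4 and name[-4] == ".":
        return fibre_no(sum(name[-3:].encode()))
    return fibre_no(0)

for name in ("song.mp3", "x.abc", "x.cba", "README"):
    print(name, fibre_ext_3_model(name) >> FIBRE_SHIFT)
```

Under these assumptions the value is a full u64 but only 128 distinct fibres
exist, so different extensions can share a fibre (here "abc" and "cba" always
do). That limit is worth checking against the actual source before extending
the scheme to four characters.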
>>>>
>>>> Thanks
>>>>
>>>> Dushan
>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe reiserfs-devel"
>> in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>