Re: [RFC] Smart fibration plugin ext_4321

On Sat, Jan 7, 2017 at 12:05 AM, Edward Shishkin
<edward.shishkin@xxxxxxxxx> wrote:
> On 01/07/2017 01:09 AM, Dušan Čolić wrote:
>>
>> On Fri, Jan 6, 2017 at 8:58 PM, Edward Shishkin
>> <edward.shishkin@xxxxxxxxx> wrote:
>>>
>>>
>>> On 01/06/2017 05:34 PM, Dušan Čolić wrote:
>>>>
>>>> On Fri, Jan 6, 2017 at 2:44 PM, Edward Shishkin
>>>> <edward.shishkin@xxxxxxxxx> wrote:
>>>>>
>>>>> On 12/26/2016 11:13 PM, Dušan Čolić wrote:
>>>>>>
>>>>>> On Mon, Dec 26, 2016 at 7:47 PM, Edward Shishkin
>>>>>> <edward.shishkin@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 12/25/2016 02:59 AM, Dušan Čolić wrote:
>>>>>>>>
>>>>>>>> Fibration is a great way to decrease fragmentation and increase
>>>>>>>> throughput.
>>>>>>>> Currently there are 4 fibration plugins: lex, dot_o, ext_1 and ext_3,
>>>>>>>> and they all have their upsides and downsides.
>>>>>>>>
>>>>>>>> The proposed fibration plugin combines them all: it places files that
>>>>>>>> share the same 1-, 2-, 3- or 4-character extension into the same
>>>>>>>> fiber group.
>>>>>>>>
>>>>>>>> With this fibration plugin, all files of one type in a folder (e.g.
>>>>>>>> xvid files) would be in the same group on disk, sorted alphabetically.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> What application wants all xvid files to be in the same group?
>>>>>>> Do you have any benchmark numbers which show advantages
>>>>>>> of the new plugin?
>>>>>>>
>>>>>> Xvid files are just an example.
>>>>>> ext_1234 fibration would be equivalent to ext_1, ext_2, ext_3, ext_4
>>>>>> and dot_o combined in one plugin.
>>>>>>
>>>>>> In the current default plugin (dot_o) we sort all files by name from the
>>>>>> start, except .o files, which we put at the end.
>>>>>> So if we had a source directory with .c, .h and .o files in it, the files,
>>>>>> shown by extension, would be sorted like: chchchchchchchchoooooooooooooo
>>>>>> I presumed that in some use cases it is better to have the files sorted as
>>>>>> ccccccccccchhhhhhhhhhhhhhoooooooooooo
>>>>>>
>>>>>> The hypothesis is that files with the same extension are of the same
>>>>>> order of size, and that this can be used to reduce fragmentation.
>>>>>
>>>>>
>>>>>
>>>>> What kind of fragmentation are you talking about?
>>>>> Internal (which results in "dead" disk space), or
>>>>> external (which results in a lot of "extents")?
>>>>>
>>>> External
>>>>
>>>>> Edward.
>>>>>
>>>>>
>>>>>> If we group files with the same extension within one directory, then
>>>>>> when we write new files of an extension after deleting some files of
>>>>>> that extension, the new files would sort into the same group as the
>>>>>> deleted ones, so they would be written in a similar place and occupy
>>>>>> a 'hole' of similar size.
>>>
>>>
>>>
>>> So "similar" means the same order, that is, file sizes can differ by a
>>> factor of 2?
>>> TBH, I don't see what can be deduced from this assumption ;)
>>> It can happen that a new file either doesn't fit into that hole, or
>>> occupies too little of it, so that the next file won't fit into the rest
>>> of the hole.
>>>
>> OFC we can never guarantee that the new file completely fits the hole
>> (especially as we go through compression in the next layer), but for a
>> file either smaller or larger than the hole we would have a higher
>> probability of fewer extents in situations with 2 or more types of files
>> in a directory. For one type of file in a directory the behavior would be
>> the same as with the dot_o and ext_1 plugins.
>
>
>
> I have to disappoint you: fibration plugins are about mapping a semantic
> tree to the storage tree. Simply speaking, they manage the mapping
> object -> key, which has nothing in common with real locations on disk.
>
> It is the block allocator that assigns disk addresses to nodes of the
> storage tree (right before writing them to disk at flush time).
> And I am sure that the block allocator doesn't care about fibration groups.
>
> I strongly recommend that you not experiment with the block allocator,
> simply because I know how many people have spent a lot of time on it
> without results.
Then what is this comment at the beginning of kassign.c about:


* In reiser4 every piece of file system data and meta-data has a key. Keys
* are used to store information in and retrieve it from reiser4 internal
* tree. In addition to this, keys define _ordering_ of all file system
* information: things having close keys are placed into the same or
* neighboring (in the tree order) nodes of the tree. As our block allocator
* tries to respect tree order (see flush.c), keys also define order in which
* things are laid out on the disk, and hence, affect performance directly.
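
To make the ordering argument concrete, here is a minimal sketch in Python of how a fibration-like function changes key order. It is only a model: the names dot_o_key and ext_key are hypothetical, the real plugins are C code inside reiser4, and the tuple (fiber, name) merely stands in for "a key whose leading part is the fiber". It shows key order only and says nothing about where the block allocator finally puts the nodes.

```python
import os


def dot_o_key(name):
    # Hypothetical stand-in for dot_o: ".o" files get a later fiber,
    # everything else shares fiber 0 and is ordered by name.
    fiber = 1 if name.endswith(".o") else 0
    return (fiber, name)


def ext_key(name, max_len=4):
    # Hypothetical stand-in for the proposed plugin: files with an
    # extension of 1..max_len characters are grouped by that extension;
    # all other files share the empty fiber.
    ext = os.path.splitext(name)[1][1:]
    fiber = ext if 1 <= len(ext) <= max_len else ""
    return (fiber, name)


def pattern(names):
    # Compress a sorted listing into its sequence of extensions.
    return "".join(n.rsplit(".", 1)[1] for n in names)


names = ["a.c", "a.h", "a.o", "b.c", "b.h", "b.o"]
by_dot_o = sorted(names, key=dot_o_key)
by_ext = sorted(names, key=ext_key)

print(pattern(by_dot_o))  # chchoo
print(pattern(by_ext))    # cchhoo
```

With the dot_o-like key the .c and .h files interleave and the .o files trail; with the extension-based key each extension forms one contiguous run in key order.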



>
> Edward.
>