Re: buffer-cache builds up with invalidate=1 too

Paolo Valente <paolo.valente@xxxxxxxxxx> · Mon, 30 Oct 2017 07:59:29 +0100

> On 28 Oct 2017, at 16:20, Jens Axboe <axboe@xxxxxxxxx> wrote:
> 
> On 10/28/2017 02:37 AM, Paolo Valente wrote:
>> 
>>> On 27 Oct 2017, at 16:21, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>> 
>>> On 10/27/2017 12:52 AM, Paolo Valente wrote:
>>>> [RESENDING, BECAUSE REJECTED BY VGER]
>>>> 
>>>>> On 27 Oct 2017, at 08:22, Paolo Valente <paolo.valente@xxxxxxxxxx> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 26 Oct 2017 06:32 AM, "Jens Axboe" <axboe@xxxxxxxxx> wrote:
>>>>> On 10/24/2017 08:10 AM, Paolo Valente wrote:
>>>>>> 
>>>>>>> On 24 Oct 2017, at 08:28, Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> If memory serves it's actually slightly more complicated. If you are
>>>>>>> using loops=<number> then I *think* (you'll have to check) you will
>>>>>>> find that invalidation happens once at the start of each loop. However when
>>>>>>> you use time_based to do the repetition there is essentially only one
>>>>>>> "loop" (even though the job goes on forever) so loop actions only
>>>>>>> happen right at the start of the job with that option (that's why I
>>>>>>> put the scare quotes around "beginning" ;-).
>>>>>>> 
>>>>>> 
>>>>>> Thanks for this additional, useful piece of information.  Actually,
>>>>>> this further, possibly different caching behavior makes me think that
>>>>>> some extra comment in the manpage might be helpful.
>>>>> 
>>>>> Would probably make sense to change 'invalidate' to be a range of
>>>>> possible values:
>>>>> 
>>>>> 0       As it is now, never invalidate
>>>>> 1       As it is now, invalidate initially
>>>>> once    Same as '1', invalidate initially / once
>>>>> open    New value, invalidate on every open
>>>>> close   New value, invalidate on close
>>>>> 
>>>>> as I can definitely see reasons why you would want to invalidate every
>>>>> time you open the file.
>>>>> 
>>>>> To do that, the 'invalidate' option should be changed from a
>>>>> FIO_OPT_BOOL to a FIO_OPT_STR, and the above possible values should be
>>>>> added as posval[] for that option.
>>>>> 
>>>>> Complement that with an enum of ranges for the ovals:
>>>>> 
>>>>> enum {
>>>>>      FIO_FILE_INVALIDATE_OFF = 0,
>>>>>      FIO_FILE_INVALIDATE_ONCE,
>>>>>      FIO_FILE_INVALIDATE_OPEN,
>>>>>      FIO_FILE_INVALIDATE_CLOSE
>>>>> };
>>>>> 
>>>>> Hope this makes sense, should be trivial to add as most of the work is
>>>>> already documented in this email :-). The remaining bit is just calling
>>>>> file_invalidate_cache() in the proper locations,
>>>>> td_io_{open,close}_file() would be prime candidates.
>>>>> 
>>>>> IMO this solution would make things both clearer and more flexible.
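
The proposal quoted above maps a handful of option strings onto enum
values. A minimal sketch of that mapping follows, under stated
assumptions: the enum is copied from the quoted mail, while
struct invalidate_val, invalidate_vals[] and parse_invalidate() are
hypothetical names made up for illustration. They are not fio's option
parser or its posval[] machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Enum as proposed in the quoted mail. */
enum {
	FIO_FILE_INVALIDATE_OFF = 0,
	FIO_FILE_INVALIDATE_ONCE,
	FIO_FILE_INVALIDATE_OPEN,
	FIO_FILE_INVALIDATE_CLOSE
};

/* Hypothetical table entry: an accepted option string and its value. */
struct invalidate_val {
	const char *name;
	int oval;
};

/* '1' and 'once' deliberately share a value, as in the proposal. */
static const struct invalidate_val invalidate_vals[] = {
	{ "0",     FIO_FILE_INVALIDATE_OFF },
	{ "1",     FIO_FILE_INVALIDATE_ONCE },
	{ "once",  FIO_FILE_INVALIDATE_ONCE },
	{ "open",  FIO_FILE_INVALIDATE_OPEN },
	{ "close", FIO_FILE_INVALIDATE_CLOSE },
};

/* Hypothetical helper: return the enum value for s, or -1 if unknown. */
int parse_invalidate(const char *s)
{
	size_t i;

	assert(s != NULL);
	for (i = 0; i < sizeof(invalidate_vals) / sizeof(invalidate_vals[0]); i++) {
		if (strcmp(s, invalidate_vals[i].name) == 0)
			return invalidate_vals[i].oval;
	}
	return -1;
}
```

Keeping '0' and '1' in the table means existing job files that set
invalidate=0 or invalidate=1 would parse as before.
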
>>> 
>>> See my followup, fio already does invalidates for each open. The
>>> problem with time_based was that we just reset the file, we don't
>>> close and re-open it. That was fixed in git:
>>> 
>>> commit 0bcf41cdc22dfee6b3f3b2ba9a533b4b103c70c2
>>> Author: Jens Axboe <axboe@xxxxxxxxx>
>>> Date:   Thu Oct 26 12:08:20 2017 -0600
>>> 
>>>  io_u: re-invalidate cache when looping around without file open/close
>>> 
>>> so current git should work for your test case. Please test.
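
The idea in the commit quoted above (invalidate again when looping
around without a close and re-open) can be sketched roughly as below.
This is an illustration under assumptions, not the code from that
commit: drop_file_cache() and rewind_and_invalidate() are hypothetical
names, and posix_fadvise() with POSIX_FADV_DONTNEED is assumed as the
way to ask the kernel to drop a regular file's cached pages.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to drop the cached pages of an
 * open regular file. Returns 0 on success, an errno value otherwise.
 */
int drop_file_cache(int fd)
{
	assert(fd >= 0);

	/* Dirty pages are not dropped by the advice, so flush them first. */
	if (fsync(fd) != 0)
		return errno;

	/* Offset 0 with length 0 covers the whole file. */
	return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}

/*
 * Hypothetical wrap-around step: the file stays open, so rewind it and
 * invalidate explicitly instead of relying on an open-time invalidate.
 */
int rewind_and_invalidate(int fd)
{
	if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
		return errno;

	return drop_file_cache(fd);
}
```

The advice is only a hint to the kernel, so pages that are still in use
elsewhere may stay cached.
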
>>> 
>> 
>> Tested, it does solve the problem.  As a side note, and if useful for
>> you, the throughput is much higher with sequential reads and direct=0
>> (4.14-rc5, virtual disk on an SSD).  It happens because of merges,
>> which do not seem to occur with direct=1.  I thought direct I/O skipped
>> buffering, but still enjoyed features such as request merging, but
>> probably I'm just wrong.
> 
> What does your job file look like? If you have something like bs=4k,
> then it's readahead saving your bacon with buffered. For O_DIRECT
> and bs=4k, each IO is sync, so it's hard/impossible to get merging.
> You need batched submission for that, through libaio for instance.
> 

Yes.  I tried several configurations in my job file, to thoroughly
test the cache issue I reported, and I ended up setting sync and 4k.
Then I reported this throughput gap, just as a possibly useful piece
of information, but without first getting to the bottom of it.  Your
feedback will help me bear in mind this other important difference
between direct and buffered: direct doesn't enjoy readahead.

Thanks,
Paolo

> -- 
> Jens Axboe
