Re: [PATCH v7 0/5] vfs: Non-blocking buffered fs read (page cache only)

Milosz Tanski <milosz@xxxxxxxxx> · Mon, 30 Mar 2015 19:06:45 -0400

On Mon, Mar 30, 2015 at 6:57 PM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 30 Mar 2015 18:49:06 -0400 Milosz Tanski <milosz@xxxxxxxxx> wrote:
>
>> > A fincore+pread solution that blocks is simply unsafe
>> > to use for us. We'll have to stay with the threadpool :-(.
>>
>> We're getting data from a network filesystem (Ceph in our case, but it
>> could be pNFS). In many cases those filesystems have some kind of
>> hierarchy and it's not uncommon for us to see requests that take 20 to
>> 25 milliseconds to complete. In this case the miss becomes very
>> expensive. And it's not just that one request experiences the
>> slowdown: all the requests being serviced by that (single) epoll thread
>> experience head-of-line blocking because of one stalled request.
>>
>> 10K requests a second is a common load for many web services / video
>> servers serving chunks of data. If we experience one miss a second,
>> that 25 millisecond stall will impact 250 other requests (all of them
>> will have a 25ms latency tacked on).
>
> I'd expect a fincore() which doesn't do SetPageReferenced() to be
> orders of magnitude better than this.  A fincore() which does use
> SetPageReferenced() will be in the "basically never happens" region -
> it would take massive and artificial memory stress to trigger.

I'm just responding to the upper bound you put out in an email a few
messages back of a 0.0001% miss rate. And people run web caches (like
Apache Traffic Server) at much higher request rates than that.
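
For reference, the arithmetic behind the 250 figure quoted above can be
sketched roughly as follows. This is only a back-of-the-envelope model:
it assumes a single event-loop thread and a uniform arrival rate, and
the helper name delayed_requests() is made up for illustration, not
anything from the patch set.

```c
/*
 * Rough estimate of how many requests arrive (and so queue up) while a
 * single-threaded event loop is stalled on one blocking read.
 *
 * Assumptions (illustrative only): uniform arrival rate, one thread,
 * and every request arriving during the stall waits behind it.
 */
static unsigned long delayed_requests(unsigned long req_per_sec,
                                      unsigned long stall_ms)
{
        /* requests/second * seconds stalled, kept in integer math */
        return req_per_sec * stall_ms / 1000;
}
```

With the numbers from this thread, delayed_requests(10000, 25) comes out
to 250, so a single miss per second is enough to add latency to a
noticeable share of the requests on that thread.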

-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: milosz@xxxxxxxxx