On Thu, Jan 26, 2012 at 11:40:47AM -0500, Loke, Chetan wrote:
> > From: Andrea Arcangeli [mailto:aarcange@xxxxxxxxxx]
> > Sent: January 25, 2012 5:46 PM
> > ....
>
> > Way more important is to have feedback on the readahead hits and be
> > sure when readahead is raised to the maximum the hit rate is near 100%
> > and fallback to lower readaheads if we don't get that hit rate. But
> > that's not a VM problem and it's a readahead issue only.
> >
>
> A quick google showed up - http://kerneltrap.org/node/6642
>
> Interesting thread to follow. I haven't looked further as to what was
> merged and what wasn't.
>
> A quote from the patch - " It works by peeking into the file cache and
> check if there are any history pages present or accessed."
> Now I don't understand anything about this but I would think digging the
> file-cache isn't needed(?). So, yes, a simple RA hit-rate feedback could
> be fine.
>
> And 'maybe' for adaptive RA just increase the RA-blocks by '1'(or some
> N) over period of time. No more smartness. A simple 10 line function is
> easy to debug/maintain. That is, a scaled-down version of
> ramp-up/ramp-down. Don't go crazy by ramping-up/down after every RA(like
> SCSI LLDD madness). Wait for some event to happen.
>
> I can see where Andrew Morton's concerns could be(just my
> interpretation). We may not want to end up like a protocol state machine
> code: tcp slow-start, then increase , then congestion, then let's
> back-off. hmmm, slow-start is a problem for my business logic, so let's
> speed-up slow-start ;).

Loke,

Thrashing safe readahead can be as simple as:

        readahead_size = min(nr_history_pages, MAX_READAHEAD_PAGES)

No need for more slow-start or back-off magic.

This is because nr_history_pages is a lower estimate of the thrashing
threshold:

  chunk A           chunk B                       chunk C                 head

   l01 l11          l12    l21                   l22
| |-->|-->|       |------>|-->|                |------>|
| +-------+       +-----------+                +-------------+               |
| |   #   |       |       #   |                |       #     |               |
| +-------+       +-----------+                +-------------+               |
| |<==============|<===========================|<============================|
         L0                     L1                            L2

Let f(l) = L be a map from
    l: the number of pages read by the stream
to
    L: the number of pages pushed into inactive_list in the meantime
then
    f(l01) <= L0
    f(l11 + l12) = L1
    f(l21 + l22) = L2
    ...
    f(l01 + l11 + ...) <= Sum(L0 + L1 + ...)
                       <= Length(inactive_list) = f(thrashing-threshold)

So the count of continuous history pages left in inactive_list is always a
lower estimate of the true thrashing-threshold. Given a stable workload,
the readahead size will keep ramping up and then stabilize in the range

        (thrashing_threshold/2, thrashing_threshold)

Thanks,
Fengguang
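
For illustration only, here is a minimal userspace C sketch of the rule
readahead_size = min(nr_history_pages, MAX_READAHEAD_PAGES). It is not
the kernel code: the cap value of 32 pages, the cached[] array standing in
for a page cache lookup, and both function names are assumptions made up
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cap for illustration; not the kernel's actual limit. */
#define MAX_READAHEAD_PAGES 32UL

/*
 * cached[i] is nonzero when the page (i + 1) positions before the current
 * read offset is still in the page cache (an assumed stand-in for a real
 * page cache lookup). Returns the length of the contiguous cached run
 * immediately behind the offset, which plays the role of nr_history_pages.
 */
static unsigned long count_history_pages(const unsigned char *cached, size_t len)
{
	unsigned long n = 0;

	assert(cached != NULL || len == 0);
	while (n < len && cached[n])
		n++;
	return n;
}

/* readahead_size = min(nr_history_pages, MAX_READAHEAD_PAGES) */
static unsigned long thrash_safe_readahead_size(unsigned long nr_history_pages)
{
	return nr_history_pages < MAX_READAHEAD_PAGES ?
		nr_history_pages : MAX_READAHEAD_PAGES;
}
```

With cached[] = {1, 1, 1, 0, 1} the history run stops at the first missing
page, so nr_history_pages is 3 and the next readahead is 3 pages. A history
run of 100 pages would be clamped to the assumed cap of 32.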