readahead for strided IO

Hi,

 

I hope you can help me out. We are currently investigating a performance issue involving an NFSv3 server (our appliance) and a Linux application doing IO against it.

 

The IO pattern is strictly sequential, but strided: the application reads 4k, skips 4k, reads 4k, skips 4k, … at monotonically increasing offsets, apparently using blocking read() calls. Unfortunately, I don’t know exactly whether the file handle was created using O_RDONLY or O_RDWR, or whether O_DIRECT or O_SYNC was specified.
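
For concreteness, the pattern as I understand it boils down to something like the following (file name, extent and the exact syscall sequence are my guesses, since we mostly see the wire traffic):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	int fd = open("/mnt/nfs/datafile", O_RDONLY);	/* guessed path/flags */
	off_t off;

	/* read 4k, skip 4k, read 4k, skip 4k, ... at monotonically
	 * increasing offsets, one blocking syscall at a time */
	for (off = 0; off < (off_t)1 << 30; off += 2 * 4096) {
		lseek(fd, off, SEEK_SET);
		read(fd, buf, sizeof(buf));
	}

	close(fd);
	return 0;
}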

 

As you can imagine, the RTT overhead (tens of usec per IO) of individual 4k NFS reads, which the NFS client only issues once the application actually requests them, is a severe limitation in terms of IOPS (bandwidth is around 25-30 MB/s, IOPS around 7000), even though the storage system / NFS server detects the strided reads and serves them directly from its prefetch cache (a few usec of latency there).
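
(Spelling out the arithmetic behind those numbers: 7000 IOPS x 4 KiB is roughly 27 MiB/s, consistent with the observed 25-30 MB/s, and 1/7000 s is roughly 140 usec per read, so the per-request round trip and client/server processing dominate, not the storage backend.)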

 

Complicating the issue, the application behaving so inefficiently is closed source. The best approach would obviously be for the application to request larger blocks of data and, once they are in application memory, discard about half of it (the strides are broken every ~20-30 IOs and interspersed with 16k reads, followed by strided reads aligned to the other odd/even 4k block offsets in the file), or to explicitly make use of the readahead() facility of Linux, as in the sketch below.
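
For illustration, what I mean by the readahead() option is roughly the following (window size, stride handling and error handling are simplified and made up; posix_fadvise(..., POSIX_FADV_WILLNEED) should work similarly):

#define _GNU_SOURCE			/* for readahead() */
#include <fcntl.h>
#include <unistd.h>

#define STRIDE	(2 * 4096)		/* 4k read + 4k skip */
#define WINDOW	(4 * 1024 * 1024)	/* prefetch 4 MiB at a time */

static void read_strided(int fd, off_t start, off_t end)
{
	char buf[4096];
	off_t off;

	for (off = start; off < end; off += STRIDE) {
		/* ask the kernel to start bringing the next window
		 * into the page cache before we walk into it */
		if ((off - start) % WINDOW == 0)
			readahead(fd, off, WINDOW);

		pread(fd, buf, sizeof(buf), off);
	}
}

With a window in the MB range, the NFS client could then issue large over-the-wire READs instead of 4k ones, at the cost of transferring the half of the data that gets skipped anyway.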

 

 

The reason I write this is my curiosity whether there is any way to configure the Linux readahead facility to be really aggressive on a particular NFS mount. We checked the /sys/class/bdi settings for the mount in question and increased read_ahead_kb, but that didn’t change anything; I guess what would be needed is a flag to have mm/readahead kick in for every read, regardless of whether it is considered a sequential read or not…
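
One user-space idea along those lines, since we cannot change the application: an LD_PRELOAD shim that tags every file the application opens with posix_fadvise(POSIX_FADV_SEQUENTIAL), which as far as I understand doubles the per-file readahead window. A rough, untested sketch (a real shim would also have to cover openat()/open64() and probably filter by path):

#define _GNU_SOURCE			/* for RTLD_NEXT */
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>

int open(const char *path, int flags, ...)
{
	static int (*real_open)(const char *, int, ...);
	mode_t mode = 0;
	int fd;

	if (!real_open)
		real_open = (int (*)(const char *, int, ...))
			    dlsym(RTLD_NEXT, "open");

	if (flags & O_CREAT) {
		va_list ap;
		va_start(ap, flags);
		mode = va_arg(ap, mode_t);
		va_end(ap);
	}

	fd = real_open(path, flags, mode);

	/* hint the kernel towards a larger readahead window */
	if (fd >= 0)
		posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

	return fd;
}

Built with gcc -shared -fPIC -o fadvshim.so fadvshim.c -ldl and run via LD_PRELOAD=./fadvshim.so. Of course this still only helps if mm/readahead classifies the accesses as sequential in the first place, which is exactly my question.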

 

Finally, are there ways to extract statistical information from mm/readahead, i.e. whether it was actually called at all (and not completely bypassed to begin with due to some flags used by the application), and when/why/how it decided to do (or not do) the IO it does?

 

Thanks a lot!

 

 

 

Richard Scheffenegger

Storage Infrastructure Architect

 

NetApp Austria GmbH

+43 676 6543146 Tel

+43 1 3676811-3100 Fax

rs@xxxxxxxxxx

www.netapp.at


Unbound Cloud™ 

The new vision of cloud data management

 


